PREFACE

The first RADC/ARPA-sponsored Invitational DoD/Industry Conference
on Software Verification and Validation was held on August 3, 4, 5,
1976 at the Syracuse hotel, Syracuse, New York. This volume
contains presentations made at the conference. A word of thanks is
expressed to all participants, and, in particular, to the speakers
whose presentations are contained in this volume, for contributing
to the conference's success.

A special thanks is given to Col Robert Krutz, the Information
Sciences Division Chief of Rome Air Development Center, for his
welcoming speech, which indicated the philosophies and predicted
directions in which the Information Sciences are headed.

Thanks are also due to Ms. Deborah Danti and Mr. Stanley Borah for
successfully operating the registration desk and for working long
hours behind the scenes, which kept the conference running smoothly.

This  document  is  not  printed  at  government  expense. 
INTRODUCTION

This  paper  was  originally  prepared  as  a portion  of  MITRE 
Technical  Report,  MTR-3232,  SOFTWARE  DEVELOPMENT  REQUIREMENTS  FOR 
COMMAND  AND  CONTROL,  in  support  of  the  Technological  Planning 
activity  in  the  Electronic  Systems  Division  Development  Plans 
Deputate.  That  report,  not  released  in  its  entirety  for  public 
distribution,  is  intended  to  provide  the  background  and  rationale  to 
support  the  definition  of  the  ESD  Technology  Needs  in  the  software 
area.  These  Technical  Needs  in  turn  will  provide  guidance  to  the 
Air  Force  Laboratories,  especially  the  Rome  Air  Development  Center, 
as  to  needed  technology  developments  to  rectify  deficiencies  and/or 
provide  new  capabilities  for  the  software  aspects  of  the  automatic 
data processing function in Air Force Command and Control Systems.

The  words  "Verification  and  Validation"  have  become  both 
cliches  and  budget  items  in  large  system  software  programs.  These 
informal  approaches  have  in  particular  been  developed  vigorously  by 
the  aerospace  industry  (cf.  Reifer74),  where  the  practical 
necessities  of  missile  software  do  not  permit  postponement  until  the 
advent  of  more  formal  proof  techniques. 

We  shall  in  this  paper,  however,  be  concerned  only  with  the 
possibility  of  demonstrating  mathematically,  at  least  at  some  level 
of  abstraction,  that  software  enjoys  certain  properties.  Among 
these  properties  might  be,  for  example,  the  fact  that  routines, 
regarded  as  functions,  are  well  defined  over  certain  argument 
domains  or,  regarded  as  procedures,  terminate  for  a certain  class 
of  initial  values.  Most  theoretical  work,  starting  with  Floyd67, 
has  been  directed  towards  proving  the  correctness  of  programs, 
defined  in  terms  of  user-supplied  I/O  relationships.  Here,  the 
least  mechanical  part  of  the  procedure  turns  out  to  be  the 
generation  of  the  inductive  assertions  associated  with  program 
loops. Even there, work is progressing towards providing mechanical
assistance in assertion generation, cf. German75. The most
impressive system for the completely automatic proof of properties
of programs is that of Moore73. All current successes in the proof
of program correctness are restricted, from the viewpoint of the
software associated with Air Force Command and Control systems, to
quite  small  programs. 

The  most  conservative  approaches  today  seem  to  be  the 
restriction  of  proofs  either  to  quite  simple  properties  or  to  small, 
critical  subsystems.  The  current  work  on  the  development  of 
verifiably  secure  operating  systems  for  multi-security-level  on-line 
computer  systems,  cf.  Burke74,  utilizes  both  these  approaches.  The 
property  chosen  for  verification,  the  so-called  *-property,  cf. 
LaPadula73,  is  a quite  simple  property  that  constitutes  a sufficient 
condition  that  the  system  conforms  to  the  Department  of  Defense 
security  policy,  Rush72.  Furthermore,  in  order  to  facilitate  the 
design,  the  proofs,  and  the  implementation,  all  data  references  are 
monitored  by  a "security  kernel"  that,  as  an  abstract  machine,  is 
the  sole  source  of  such  services  available  to  the  more  external 
balance  of  the  operating  system.  By  this  localization  of  all  "real" 
memory  references,  it  is  hoped  to  greatly  reduce  the  size  of  that 
portion  of  the  operating  system  for  which  the  *-property  must  be 
proved.
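
To make the role of such a kernel concrete, here is a minimal sketch, not the design of any system cited above. It assumes the usual statement of the two access rules of the model: a subject may read only objects at or below its own level, and, under the *-property, may write only objects at or above its own level. The level names, the `SecurityKernel` class, and its methods are hypothetical, and compartments (categories) are omitted.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical linear ordering of classifications; categories omitted.
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3


class SecurityKernel:
    """Sole mediator of data references in this sketch."""

    def __init__(self):
        self._objects = {}  # object name -> (level, list of stored items)

    def create(self, name, level):
        self._objects[name] = (level, [])

    def read(self, subject_level, name):
        level, data = self._objects[name]
        # Simple security condition (assumed): no reading above one's level.
        if subject_level < level:
            raise PermissionError("read up denied")
        return list(data)

    def append(self, subject_level, name, item):
        level, data = self._objects[name]
        # *-property (assumed form): no writing below one's level.
        if subject_level > level:
            raise PermissionError("write down denied")
        data.append(item)
```

Because every reference passes through `read` or `append`, the property has to be established only for these two short routines rather than for all callers, which is the size reduction the paragraph above hopes for.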

Work in this direction is obviously of great significance to
the  Air  Force.  Tactical  Command  and  Control  systems  must  provide 
users  with  restricted  access  privileges  to  data  on  terminals  that 
might  even  fall  (together  with  informed  users)  into  enemy  hands. 
Multi-security-level  on-line  systems  are  also  being  sought  for  the 
economic  benefits  they  provide  through  continuous  resource  sharing 
in  contrast  to  conventional,  costly  "sanitization"  procedures. 
Multi-level  secure  on-line  data  processing  or  communications  systems 
currently  operating  or  under  construction  include  the  MULTICS 
Systems  at  the  Air  Force  Data  Services  Center,  the  Security  Kernel 
Based  MULTICS  System,  the  Prototype  Secure  Front  End  Processor,  the 
PDP-11/45  Security  Kernel  Based  System  at  The  MITRE  Corporation,  and 
the  AUTODIN  system.  Those  under  development  include  SATIN  IV  and 
some upgraded versions of the AABNCP.

It  would  be  desirable,  if  feasible,  to  be  in  a position  to 
prove other, perhaps more complicated, properties of such
multi-level systems. It is, in reality, necessary to demonstrate more
generally that the integrity of a data processing system is assured,
e.g.,  that  no  user  can,  in  an  unauthorized  manner,  deny  the 
legitimate  services  of  the  system  to  other  users.  Software  which 
supports  high  cost/high  risk  operations  such  as  space  missions  or 
satellite  launches  must  work  the  first  time.  For  communications 
systems,  we  might  want  to  prove  that,  at  least,  the  President's 
message  will  get  delivered  and  to  the  correct  recipient.  It  would 
be  preferable,  though  more  difficult,  to  prove  that  it  will  be 
delivered  within  a stipulated  time  interval. 

There  are  certain  areas  in  which  software  probably  cannot  be 
used  unless  we  actually  go  further  and  prove  it  more  generally 
correct.  One  example  would  be  software  that  provides  an  automatic 
strategic  weapon  allocation  capability.  Another  would  be  software 
entrusted  with  the  actual  encryption  of  messages.  Until  software 
is demonstrated correct, it probably should not be employed for such
critical  purposes.  Furthermore,  it  is  probably  only  such  critical 
services  that  could  justify  the  cost  of  full  correctness  proofs  in 
the  foreseeable  future. 

Mechanical  Verification 

Since  combinatorial  explosion  precludes  the  exhaustive  testing 
of  large  programs,  it  is  tautological  that  testing  can  only  reveal 
the  presence  of  bugs,  not  their  absence.  The  pressing  need  felt 
today  for  software  verification  derives,  not  from  such  logical 
considerations,  but  rather  from  bitter  experience.  All  large 
programs  have  had  their  share  of  bugs,  and  legion  are  the  anecdotes 
of  software  systems  that  have  functioned  for  several  years  before 
some  novel  utilization  revealed  fundamental  errors.  More 
significantly,  "tiger  teams"  have  been  able  to  crack  the  security 
and to gain complete control, including access to all files, of such
current  operating  systems  as  OS  (Goheen72)  and  MULTICS  (Karger74). 

It  is  these  frightening  experiences  that  have,  in  particular,  led  to 
the  search  for  methods  of  designing  and  implementing  verifiable 
security kernels for on-line multi-security-level computer systems.

The same sort of empirical evidence that dictates a lack of
faith  in  non-verified  software  also  dictates  that,  with  respect  to 
matters  involving  national  security,  we  have  no  business  trusting 
manual  mathematical  proofs  as  sufficient  verification  of  properties 
of  programs.  There  is  an  equally  large  treasury  of  anecdotes 
relating  to  false  mathematical  proofs.  This  is  a history,  we  might 
point  out,  that,  unlike  the  history  of  software  errors,  extends  over 
centuries.  We  are  referring  to  submitted,  refereed,  published 
results  accepted  for  a number  of  years  before  either  they  were 
discovered  false  or  a gap  in  the  proof  was  disclosed.  To  choose  but 
one  example,  we  should  like  to  quote  from  the  introduction  of  an 
article  appearing  in  a recent  issue  of  the  Annals  of  Mathematics, 
Masur75: 


The  study  of  the  geometry  of  the  classical 
Teichmueller  spaces  was  begun  in  1959  by  Kravetz  [9]. 
The  starting  point  was  the  classical  theorem  of 
Teichmueller on extremal quasiconformal maps between
compact Riemann surfaces. The Teichmueller
theorem was used to argue that with respect to
the Teichmueller metric, Teichmueller space is
straight and that it has negative curvature.

In  turn,  negative  curvature  was  used  to  show  that 
every  finite  subgroup  of  the  Teichmueller  modular 
group  has  a fixed  point.  This  latter  statement 
is  known  to  be  equivalent  to  the  Nielsen 
Realization  Problem  which  conjectures  that  every 
finite  subgroup  of  the  mapping  class  group  of 
a surface  can  be  realized  by  a finite  subgroup  of 
the  group  of  homeomorphisms. 

Recently  Linch  [10]  found  a mistake  in 
Kravetz's curvature arguments so that problem and
consequently  also  the  fixed  point  problem  were 
reopened.  The  main  result  in  this  paper  is  that 
Teichmueller space does not have negative curvature.

The history here, then, is that the published proof survived
for  twelve  years  until  it  was  discovered  faulty.  It  took  another 
four  years  before  the  result  itself  was  discovered  false. 
Unfortunately,  with  respect  to  the  validity  of  demonstrations  of 
properties  of  programs,  the  history  might  be  reversed.  A manually 
verified  security  kernel  may  first  be  demonstrated  insecure — by  an 
enemy's  cracking  it — and  only  subsequently  may  the  error  in  the 
proof  be  discovered. 

A discussion  of  the  precarious  certainty  of  mathematical  proof 
is  found  in  Davis72.  It  is  replete  with  both  histories  of  major 
errors  by  first  rank  mathematicians  and  mathematical  texts 
accompanied  by  tens  of  pages  of  errata.  We  must  further  keep 
constantly  in  mind  that  ordinary  published  mathematical  proofs 
probably  normally  range  from  ten  lines  to  twenty  pages.  The  proof 
of  so  simple  a property  as  the  security  preservation  of  the  kernel 
of  a practical  operating  system  is,  on  the  other  hand,  likely  to  be 
a document  some  inches  thick.  Such  massiveness  will  do  little  to 
instill confidence with respect to the absoluteness of any proof.

We  are  convinced  that  there  is  an  imperative  that  the 
verification  of  critical  properties  of  critical  programs  be 
pronounced  by  machines.  While  there  are  many  difficulties  to 
overcome  before  this  becomes  a practical  alternative,  a matter 
discussed  at  some  length  below,  we  should  like  to  point  out  that 
there  exists  at  least  one  domain  in  which  the  superiority  of 
mechanical over manual verification has become almost
commonplace.  Every  entry  in  symbolic  mathematical  tables,  e.g., 
tables  of  definite  or  indefinite  integrals,  as  well  as  every 
symbolic  computational  exercise  in  every  physics  or  engineering  text 
represents  a theorem.  Since  1964  mechanical  symbolic  computation 
systems  such  as  MATHLAB,  Engelman71,  or  SIN,  Moses71,  have  been 
discovering  errors  in  well-renowned  tables  and  texts.  In  some 
cases,  these  have  corresponded  to  false  proofs;  in  other  cases,  to 
false  theorems. 

The  most  desirable  of  situations  with  respect  to  the  automatic 
verification  of  the  correctness  of  programs  would  be  for  the 
programs to be automatically generated by theorem provers as
exemplified by Green69 or Waldinger69. The work of Engelman74 would
represent  a middle  position  in  which  the  validity  of  each  of  the 
pattern  based  program  transformation  rules  could  be  proved 
independently  and  the  validity  of  any  generated  program  would  then 
follow  as  a consequence.  Short  of  such  automatic  program  synthesis 
schemes,  the  ideal  in  mechanical  program  verification  would  be  for 
the  proofs  to  be  generated  by  a mechanical  theorem  prover,  cf. 
Moore73. Unfortunately, there is little evidence that any such
schemes  show  promise  of  developing  within  the  next  decade  to  the 
stage  that  they  can  handle  anything  approaching  current  systems 
software  or  applications  programs. 

While  we  must  remain  skeptical  about  the  capability  of  theorem 
provers to generate demonstrations of properties of large programs
(in fact there are quite discouraging complexity results with respect
to theorem provers, cf. Fischer74), it may remain possible that much
can  be  done  in  the  way  of  providing  mechanical  verification  of 
proofs,  constructed  either  manually  or  in  semi-automatic 
environments,  employing  mechanical  aids.  For  example,  proof 
checkers  for  the  first  order  predicate  calculus  are  available 
(Weyrauch74).

Of  great  interest  are  several  systems  under  development  that 
are  attempting  to  find  a middle  ground  between  automatic  proof 
generation,  which  seems  too  remote  a goal,  and  mechanical  proof 
checking,  which  places  too  tedious  a burden  on  the  person  supplying 
the  proof.  These  systems,  all  of  which  appear  quite  similar  on  the 
surface,  are  efforts  to  create  a program  verification  environment  in 
which  the  user  has  on-line  access  to  a variety  of  tools: 
bookkeeping, proof checking, automatic simplifiers and, especially,
verification  condition  generators.  The  most  prominent  among  those 
systems  that  exist  in  experimental  form  are  those  at  the  IBM  Watson 
Research  Center  (IBM),  King75,  the  Information  Sciences  Institute, 
University of Southern California (ISI), Uncapher75, pp. 1-13, the
Stanford  Artificial  Intelligence  Laboratory  (SAIL),  Luckham76, 
Stanford  Research  Institute  (SRI),  Elspas7b,  Systems  Development 
Corporation  (SDC),  Schorre75,  the  University  of  Texas  at  Austin 
(TEXAS), Good7^, and XEROX Palo Alto Research Center (XEROX),
Deutsch73.

We  shall  attempt  a composite  description  of  how  these  systems 
are  intended  to  operate.  All  are  based  on  Floyd's  inductive 
assertion  method  of  proving  properties  of  programs,  Floyd67.  The 
first  portion  of  each  system  is  a parser  that  accepts  programs 
written  in  a specific  language  (IBM:  a subset  of  PL/I;  ISI  and  SAIL: 
PASCAL; SDC: a LISP-like language; SRI: a subset of JOVIAL/J3;
TEXAS: GYPSY) as well as certain formal statements about the program
supplied  by  the  user.  These  include  input  predicates  and  output 
predicates  that  define  the  "correctness"  of  the  program  and  a set  of 
inductive  assertions  associated  with  the  program  loops.  These  are 
translated  into  some  appropriate  internal  form.  The  translated 
program  and  annotations  are  then  passed  to  a verification  condition 
(VC)  generator.  The  output  of  this  generator  is  a set  of 
verification  conditions,  i.e.,  statements  in  some  first  order 
predicate  calculus  generally  involving  arithmetic,  equalities, 
inequalities,  and  some  programming  structures  such  as  stacks, 
arrays, or lists. That is, the verification condition generator maps the
problem  of  proving  properties  about  programs  into  a problem  of 
proving  statements  in  some  formal  logic.  In  order  to  do  this,  the 
system  must  contain  axiomatic  knowledge  about  the  semantics  of  the 
programming  language.  This  explains  the  popularity  for  these 
systems  of  PASCAL,  a language  designed  for  structured  systems 
software production, amenable to formal axiomatization, cf. HOARE73.
The  VCs  generated  depend  strongly  on  the  particular  inductive 
assertions  originally  supplied  by  the  user. 
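
The mapping from an annotated program to verification conditions can be sketched on a very small case. The summation loop, the predicate names, and the bounded search below are all assumptions made for illustration; none is taken from the systems described. In particular, the bounded search merely stands in for a prover and establishes nothing outside the sampled range.

```python
from itertools import product

# Annotated program (hypothetical example):
#   input predicate:  n >= 0
#   i = 0; s = 0
#   while i < n:          # inductive assertion attached to this loop
#       s = s + i
#       i = i + 1
#   output predicate: 2*s == n*(n-1)


def pre(n):
    return n >= 0


def post(s, n):
    return 2 * s == n * (n - 1)


def strong_inv(i, s, n):
    return 0 <= i <= n and 2 * s == i * (i - 1)


def weak_inv(i, s, n):
    # Too weak: it says nothing about s.
    return 0 <= i <= n


def implies(a, b):
    return (not a) or b


def make_vcs(inv):
    """One verification condition per path between assertions."""
    return {
        "entry": lambda i, s, n: implies(pre(n), inv(0, 0, n)),
        "loop": lambda i, s, n: implies(inv(i, s, n) and i < n,
                                        inv(i + 1, s + i, n)),
        "exit": lambda i, s, n: implies(inv(i, s, n) and i >= n, post(s, n)),
    }


def check(inv, bound=10):
    """Bounded search for counterexamples; a stand-in for a prover, not a proof."""
    values = range(-bound, bound + 1)
    return {
        name: all(vc(i, s, n) for i, s, n in product(values, values, values))
        for name, vc in make_vcs(inv).items()
    }
```

With `strong_inv` no counterexample turns up for any of the three conditions. With `weak_inv` the exit condition fails (for instance at i = 0, s = 1, n = 0), even though the program is unchanged. This shows how the verification conditions depend on the assertions supplied.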

The  proof  of  correctness,  i.e.,  the  proof  that  the  program 
satisfies  the  output  predicate  on  the  condition  that  the  input 
satisfies  the  input  predicate,  is  thus  mechanically  reduced  to  the 
proof  of  the  VCs.  It  is  this  latter  proof  that  is  intended  to  be 
interactive.  For  this  purpose,  the  system  must  possess  first  some 
basis  of  facts  (axioms  or  theorems)  expressed  in  the  VC  calculus. 
These  facts  may  be  supplied  as  part  of  the  system,  may  be  deduced  by 
the  system  (alone  or  interactively),  or  may  be  introduced  by  the 
user.  Operating  on  this  basis  of  facts  is  a deductive  system  that 
strives to find proofs that the VCs follow from the factual basis.
In this process, it may be guided by the user. The mechanical tools
provided  tend  to  be  a melange  of  theorem  provers  and  "black  boxes". 
The  theorem  provers  may  be  complete  for  some  calculi,  e.g., 
resolution  principle  theorem  provers  for  the  pure  first  order 
predicate  calculus  or  a semi-decision  procedure  for  Presburger 
Arithmetic,  or  may  be  heuristic  goal-oriented  processes  applied  to 
non-semi-decidable  domains  such  as  integer  arithmetic.  The  black 
boxes  may  include  tautology  deciders  for  the  propositional  calculus 
(Weyrauch  at  SAIL  has  one  that  can  decide  a tautology  containing 
over  100  variables  in  seconds)  and  simplification  procedures  for 
algebraic  and  conditional  expressions.  In  some  systems,  the  user  is 
allowed  to  introduce  facts  without  proof  (assumptions)  and  some 
post-processing  in  the  form  of  a proof  checker  (cf.  Weyrauch74)  is 
needed  to  confirm  the  verification.  Finally,  these  interactive 
systems  present  varying  capabilities  for  automatic  bookkeeping  to 
help  manage  larger  proof  efforts.  This  is  particularly  important  in 
the  case  of  hierarchical  programs  where  the  proof  is  likely  to  be  an 
extended,  incremental  process.  What  is  needed  in  that  case,  in 
fact,  is  an  automatic  record  of  how  the  proof  hangs  together,  so 
that  only  the  logical  minimum  need  be  proved  anew  when  the 
hierarchical  program  and/or  the  proof  strategy  is  modified. 
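
As an illustration of the simplest kind of "black box" mentioned above, the following is a minimal sketch of a tautology decider for the propositional calculus. The tuple encoding of formulas and the function names are assumptions made here. The method is plain truth-table enumeration, which needs 2^n evaluations for n variables, so a decider handling over 100 variables in seconds would need a more economical method, which the text does not describe.

```python
from itertools import product

# Formulas are nested tuples (hypothetical encoding):
#   ("var", "p"), ("not", f), ("and", f, g), ("or", f, g), ("implies", f, g)


def variables(formula):
    """Collect the names of the propositional variables in a formula."""
    if formula[0] == "var":
        return {formula[1]}
    return set().union(*(variables(sub) for sub in formula[1:]))


def evaluate(formula, env):
    """Truth value of a formula under an assignment env: name -> bool."""
    op = formula[0]
    if op == "var":
        return env[formula[1]]
    if op == "not":
        return not evaluate(formula[1], env)
    if op == "and":
        return evaluate(formula[1], env) and evaluate(formula[2], env)
    if op == "or":
        return evaluate(formula[1], env) or evaluate(formula[2], env)
    if op == "implies":
        return (not evaluate(formula[1], env)) or evaluate(formula[2], env)
    raise ValueError(f"unknown connective: {op}")


def is_tautology(formula):
    """True if the formula holds under every assignment to its variables."""
    names = sorted(variables(formula))
    return all(
        evaluate(formula, dict(zip(names, values)))
        for values in product([False, True], repeat=len(names))
    )
```

For example, Peirce's law, `((p implies q) implies p) implies p`, is accepted, while `p implies q` is rejected.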

The  status  of  these  systems  today  is  that  they  have  been  put 
together  on  an  experimental  basis,  largely  from  previous  software 
components  and  are  capable  of  supporting,  if  rather  ponderously,  the 
demonstrations  of  the  correctness  of  rather  small  programs,  e.g., 
sorting  algorithms.  We  should  like  now  to  list  a number  of 
deficiencies  of  these  systems  as  we  perceive  them.  This  is  not 
intended  as  condemning  criticism,  only  as  indications  of  directions 
in  which  we  are  sure  the  authors  would  agree  more  development  should 
be  encouraged. 

1.  The  size  of  programs  that  can  be  handled.  While  it  is 
hoped  (and  occasionally  claimed)  that  the  support  of  proofs  layered 
hierarchically in levels of abstraction will permit the
semi-automatic mechanical verification of systems size programs, this is
still an unredeemed promise.

2.  The type of programs that can be handled. Most programs
that have been verified with the use of these systems are basically
simple mathematical algorithms, e.g., sorting algorithms or parsers.
Among  the  systems  programming  areas  in  which  more  work  is  needed 
are: 


a.  Pointers. This area is extremely active.
Luckham (SAIL) is working on verification for pointer manipulation
in  PASCAL,  which  supplies  pointers  as  data  types.  The  problem  is 
presumably  more  difficult  in  a language  like  PL/I,  where  a variable 
may  be  a disguised  pointer.  Boyer  and  Moore  (SRI)  are  working  on 
automatic  proofs  for  pointer  manipulating  LISP  programs.  In  pure 
LISP,  of  course,  the  pointer  manipulation  is  an  invisible  property 
of  the  implementation  that  need  only  be  addressed  at  the  level  of 
the  verification  of  the  LISP  interpreter,  not  at  that  of  the 
verification  of  LISP  programs.  However,  Boyer  and  Moore  wish  to 
extend  their  automatic  proof  generator  to  an  extended  LISP  that 
admits  user  invocation  of  the  functions  RPLACA  and  RPLACD.  Crocker 
(ISI)  is  developing  formalisms  for  the  manual  verification  of 
pointer  manipulating  routines  in  assembly  language  as  part  of  his 
general  goal  of  developing  proofs  of  properties  of  the  ARPANET  IMP 
software,  e.g.,  guaranteeing  that  a message  will  be  delivered. 

b.  Concurrency.  Another  active  area.  TEXAS,  for 
example,  has  included  concurrency  and  synchronization  dictions 
in  the  design  of  GYPSY  in  order  to  provide  a vehicle  for 
verified  systems  programming. 

c.  Actual  machine  arithmetic  in  contrast  to 
idealized  mathematical  domains. 

d.  I/O.  Seemingly  always  ignored. 


e.  Error  handling. 

f.  Heuristic programs for which "correctness" is elusive.

g.  At the extreme, McCarthy and Weyrauch (SAIL) wish
to  develop  verification  techniques  for  programs  that  "interact 
with  the  real  world". 

3.  Properties of programs. Proofs of correctness often ignore
questions  of  termination.  We  also  need  approaches  to  the 
verification  of  the  performance  of  programs,  e.g.,  guarantees  that 
concurrent  processes  will  not  only  not  deadlock,  but  that  certain 
ones  will  be  completed  within  a specified  interval. 

4.  More integrated systems. The need for integration of the
verification  process  is  the  main  theme  of  this  paper.  It  applies 
equally  to  the  semiautomatic  verification  systems  we  are  now 
discussing.  For  example,  verified  PASCAL  programs  should  be  compiled 
by  verified  PASCAL  compilers.  More  particularly,  though,  certain 
questions  arise  out  of  the  way  in  which  these  systems  have  been 
slapped  together  from  existing  components.  For  example,  one  might 
prefer  the  construction  of  axiomatically  defined  algebraic 
simplifiers to the use of ad hoc systems such as REDUCE.

5.  Better  human  interfaces.  This  may  turn  out  to  be  the  make 
or  break  issue  with  respect  to  the  eventual  frequent  and  successful 
employment  of  these  systems.  Included  is  the  need  for  conceptually 
simple  bookkeeping  tools  for  the  management  of  hierarchical  proofs 
as  well  as  the  deeper  question  of  user  comprehension  of  the 
verification  conditions.  While  the  representation  of  first  order 
predicate calculus formulas in clause form (with existential
quantifiers replaced by Skolem functions and both universal
quantifiers and conjunctions implicit) may be necessary for a
resolution  principle  black  box  within  the  system  (cf.  Nilsson71, 
Chapter  6),  it  is  not  necessarily  the  form  most  comprehensible  by 
the  user.  Some  systems  either  maintain  or  resupply  the  quantifiers. 

The  deeper  question,  however,  is  the  traceability  of  the  VCs  to 
the  inductive  assertions.  We  expect,  for  example,  that  a common 
occurrence  would  be  for  the  interactive  system  to  present  a user 
with  VCs  that  it  had  failed  to  demonstrate  and  ask  him  for  guidance. 
After  a short  while,  the  user  might  realize  that  this  set  of  VCs 
could not, in fact, be proved in general and that it is necessary to
go  back  and  strengthen  some  of  the  inductive  assertions  that  he  had 
annotated  to  the  program  loops.  But  which  and  how?  Essentially, 
the  system  must  be  somehow  capable  of  "decompiling"  the  work  done  by 
the  VC  generator  in  order  to  supply  the  required  traceability. 
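
As a small illustration of the clause form mentioned under item 5, consider the conversion below. The formula and the symbols P, Q, and f are chosen here for illustration and do not come from any of the systems discussed.

```latex
\forall x\, \exists y\, \bigl( P(x, y) \land Q(y) \bigr)
\quad\longrightarrow\quad
\{\, P(x, f(x)) \,\},\quad \{\, Q(f(x)) \,\}
```

The existential quantifier has become the Skolem function f, the universal quantifier on x is left implicit, and the conjunction survives only as the listing of two separate clauses. A user shown just the two clauses must reconstruct that reading, which is the comprehension burden at issue.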

Furthermore,  much  may  have  to  be  learned  not  only  about  the 
capabilities  we  can  expect  in  mechanical  verification  systems  but 
also  about  the  complementary  questions  of  how  to  structure,  specify, 
and  implement  large,  complex  programs  in  order  to  facilitate  their 
mechanical  verification.  Work  at  SRI  relevant  to  the  structuring  of 
programs  as  complex  as  security  kernels  is  reported  in  Robinson75 
and  Robinson75a.  They  employ  the  technique  of  Parnas  specifications 
(Parnas72)  to  define  the  program  as  a hierarchy  of  levels  of 
abstraction,  each  represented  by  an  abstract  machine.  The  intent  is 
that  this  structuring  and  specification  technique  will  abet  both  the 
construction  of  secure  systems  and  the  proofs  of  their  security,  the 
proofs  themselves  being  layered  by  the  identical  hierarchy.  A 
realistic  appraisal  of  the  likely  power  of  mechanical  validation 
tools  will  probably  lead  to  the  concession  that  the  proof  techniques 
must  be  adapted  to  the  limitations  of  the  semiautomatic  verifier  and 
that  the  program  specification  technique,  in  turn,  must  be  adapted 
to  the  restrictions  of  the  proof  technique.  That  is,  it  would  not 
seem  reasonable  to  expect  a posteriori  mechanical  proofs  of 
arbitrarily  written  programs.  The  technologies  of  proof, 
specification,  and  implementation  must  be  integrated. 

Complete  Software  Verification 

Our  discussion  of  the  problem  of  verifying  the  kernel  of  a 
multi-security-level on-line computer system has so far followed, in
general outline, the approach of the ESD project on the
Security Kernel Based Multics System involving ESD, MITRE, MIT, SRI,
and Honeywell (Burke74). That is, we have been assuming a security
policy  enunciated  by  DoD,  a mathematical  model  that  formalizes  that 
policy,  a kernel  specification  that  adheres  to  that  model,  and  an 
implementation  that  adheres  to  that  specification.  The  verification 
of  the  kernel  would  then  consist  of  a proof  that  the  specification 
satisfies the model and a second proof that the
implementation satisfies the specification, this latter to be more
accurately  viewed  as  a hierarchically  structured  sequence  of  proofs. 
To  this  we  have  added  our  own  conviction  that  the  verifications  need 
to  be  checked  mechanically.  Appropriately,  we  understand  that  SRI 
wishes  to  commit  itself  to  semiautomatic  mechanical  verification  for 
the  future  of  that  project  (Levitt76). 

In  order  to  complete  the  chain  of  software  verification,  we 
should  start  by  asking  in  which  language  the  implementation  is 
written.  It  would  seem  unlikely  that  this  should  be  the  machine 
language.  There  is  a tendency  today  to  write  systems  in  languages 
such  as  BLISS  (Wulf70)  or  PASCAL  (Wirth75)  that  provide  structured 
programming  support  for  systems  software.  In  the  case  of  a MULTICS 
kernel,  PL/I  may  be  difficult  to  resist  since  the  balance  of  the 
MULTICS  operating  system  is  written  in  it. 

The  use  of  a higher  (or  medium)  level  language  presents  three 
problems.  First,  it  is  necessary  that  the  language  itself  have  a 
rigorously  defined  semantics  so  that  the  program  proofs  are 
meaningful.  Second,  there  must  be  a means  of  transition  from  the 
proofs  for  the  implementation  language  programs  to  the  verification 
of  the  machine  code.  There  would  seem  to  be  two  distinct  approaches 
available  to  deal  with  this  transition.  One  would  be  an  additional 
layer  of  proofs  that  the  machine  code  produced  by  the  compiler  (or 
some  hand  modified  version  of  that  code)  corresponds  to  the 
implementation  language  code.  The  other  approach  would  be  to  make  a 
capital  investment  in  the  verification  of  the  correctness  of  a 
compiler  for  the  implementation  language.  This  is  certainly  the 
more  attractive  alternative  since  changes  in  the  system 
specification  and/or  the  verification  effort  itself  are  likely  to 
impose  the  need  for  changes  in  the  program.  Third,  the  hardware 
instructions  must  themselves  bear  a formally  defined  semantics  in 
order  that  the  verification  of  the  programming  language  to  machine 
language  transition  be  definable.  Actually,  the  need  for  formal 
machine  language  semantics  would  obviously  exist  even  if  the  subject 
program  were  written  in  machine  language.  We  shall  return  to  this 
point  below. 
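
The statement a verified compiler would have to satisfy can be made concrete on a toy scale. The expression language, the stack machine, and every name below are hypothetical, chosen only to show the three ingredients: a source semantics, a machine semantics, and a translation between them. The agreement check samples random programs, so it tests the correspondence rather than proving it; a proof would proceed by induction on the structure of expressions.

```python
import random

# Source language (hypothetical): ("num", k), ("add", e1, e2), ("mul", e1, e2)


def evaluate(expr):
    """Source semantics: the value denoted by an expression."""
    tag = expr[0]
    if tag == "num":
        return expr[1]
    left, right = evaluate(expr[1]), evaluate(expr[2])
    return left + right if tag == "add" else left * right


def compile_expr(expr):
    """Compiler: translate an expression to code for a toy stack machine."""
    tag = expr[0]
    if tag == "num":
        return [("PUSH", expr[1])]
    op = "ADD" if tag == "add" else "MUL"
    return compile_expr(expr[1]) + compile_expr(expr[2]) + [(op,)]


def run(code):
    """Machine semantics: execute the instructions, return the top of stack."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack[-1]


def random_expr(rng, depth):
    """Generate a random expression of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return ("num", rng.randint(-9, 9))
    return (rng.choice(["add", "mul"]),
            random_expr(rng, depth - 1),
            random_expr(rng, depth - 1))


def agree_on_samples(trials=200, seed=0):
    """Check evaluate(e) == run(compile_expr(e)) on sampled expressions."""
    rng = random.Random(seed)
    samples = (random_expr(rng, 4) for _ in range(trials))
    return all(evaluate(e) == run(compile_expr(e)) for e in samples)
```

Note that `run` is itself a definition of what the machine instructions mean. Without it the correspondence could not even be stated, which is the third problem raised above.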

Finally, the mechanical verifier, itself, must be verified.
This leads to the conclusion that it must be simple enough to permit
a convincing  manual  verification  (or  a proof  within  the  formal  scope 
of  a manually  verified  mechanical  proof-checker).  The  price  of  this 
might  be  a weakness  in  the  tools  provided  and  would  be  reflected  in 
limitations  on  the  permissible  proof  methodology.  This  latter  would 
serve  in  turn  to  restrict  the  permissible  program  specification  and 
implementation  techniques.  The  final  effect  might  be  to  measurably 
increase  the  cost  of  verified  software. 

A possible  solution  might  derive  from  our  tacit  assumption  that 
a program  verifier  is  capable  of  proving  programs  more  complicated 
than  itself.  It  might  thus  be  possible  to  approach  a more  powerful 
verifier  through  a bootstrapping  sequence  of  verifiers  of  which  only 
the  initial  one  would  require  a manual  verification.  However,  we 
must  keep  in  mind  that  we  are  dealing  here  with  a rather 
encompassing  definition  of  correctness,  not  some  relatively  simple 
property  such  as  security  preservation. 

Kernel  Minimization 

We know that the cost of automatic proof generation grows
super-exponentially with program complexity (Fischer74). We have been
tacitly  assuming  that  the  mechanical  (and  manual)  verification  of 
program  properties  can  be  restrained  by  hierarchical  program 
specification  techniques,  such  as  those  expounded  by  SRI,  to 
bearable,  though  extremely  tedious,  magnitudes.  The  success  of  such 
an  approach  to  the  problem  of  the  verification  of  the  security 
preservation  of  complex  operating  systems  might  depend  on  our 
ability  to  restructure  such  systems  so  as  to  minimize  the  complexity 
of the required kernel. Perhaps a less ad hoc study of operating
systems  architecture  from  this  point  of  view  would  be  profitable. 

Impact  of  Distributed  Computation 

Distributed  computation,  e.g.,  to  support  survivable  command 
and  control  systems,  may  introduce  additional  issues  into  the 
general problem of software verification. These are difficult to
anticipate since the logic of such computation systems and the
nature of the associated software, even the appropriate programming
languages, are little understood today.

It  may  be  that  the  utilization  of  integrated  circuit  based 
microprocessors  will  transfer  some  of  the  "program"  from  software  to 
hardware  and  this  might  introduce  novel  verification  problems.  This 
point  is  mentioned  again  below. 

Operating  systems  often  consist,  of  course,  of  a number  of 
simultaneous  processes.  Parallelism  introduces  special  difficulties 
for  verification  techniques,  and  some  study  has  been  made  of  this 
problem, e.g., Lipton75, Ashcroft73, or Levitt72. Nonetheless, this
added  complexity  might  serve  to  motivate  an  attempt  to  restrict  the 
parallelism  of  the  security  kernel,  as  contrasted  to  that  of  the 
extended  operating  system,  of  a multi-level  computing  system.  Such 
options  may  not  be  available  in  the  case  of  distributed  processing. 

Hardware  Verification 

The  question  of  the  verification  of  the  correctness  of  a 
program  or  of  its  enjoying  any  other  property,  such  as  termination 
or  security  preservation,  must  resolve  finally  to  the  same  property 
being  enjoyed  by  a machine  language  representation  of  that  program. 
We  must  therefore  obtain  rigorous  formalizations  of  the  semantics  of 
machine  instruction  sets.  A partial  effort  in  this  direction  is,  in 
fact, included in the machine language handbook for at least one
computer, the PDP-11/45. That description, based on the
specification  for  each  instruction  of  its  effects  on  a fixed  set  of 
state  variables,  might  suffice  with  perhaps  a bit  more  formal 
description  of  the  process  interpreter,  including  interrupt 
handling.  On  the  other  hand,  it  could  be  argued  that  a complete 
proof  would  also  involve  the  axiomatization  of  the  arithmetic  and 
logical  operations.  The  thought  of  carrying  the  proofs  of 
properties  of  complex  programs  down  to  such  detail  is  rather 
intimidating, though Crocker (ISI) has some such ambitions.
Significant work on the verification of microcode is in progress at
the IBM Watson Research Center.

In the case of microcomputers in which certain normally
software  functions  are  delegated  to  hardware,  we  must  extend  our 
proof  methods  to  the  circuitry  just  to  obtain  a proof  at  the  level 
of  current  program  verification.  In  the  hierarchical  approach  to 
program  validation  being  studied  at  SRI,  the  suggestion  has  been 
made  that  the  "program"  be  structured  and  verified  without  concern 
as  to  which  levels  of  abstraction  represent  software  and  which 
hardware,  with  the  boundary  being  drawn  a posteriori.  However,  most 
microprocessors  will  be  purchased  from  manufacturers  with  no 
customer  control  over  their  internal  logic.  Even  if  the  logic  is 
formally  specified,  its  implementation  might  turn  out  to  be  dictated 
by  engineering  and  production  considerations.  There  may  be 
questions  therefore  as  to  whether  it  would  necessarily  exhibit  the 
"good  structure"  we  assume  in  software  designed  with  verification  in 
mind. 

There  is  another  issue  in  hardware  verification  that  must  also 
be  dealt  with.  It  is  the  question,  especially  with  respect  to 
multi-level  secure  systems,  of  trusting  conventional  hardware  to 
meet  its  specifications.  Errors  in  hardware  design  may  invalidate 
permanently  any  software  verification  while  hardware  failures  may  do 
so  temporarily.  One  response  to  this  problem  is  the  so-called 
"subverter",  that  is,  the  purposeful  introduction  of  a process  into 
the  operating  system  whose  purpose  is  to  periodically  test  the 
correctness  of  the  hardware.  This  has  been  done  for  the  testing  of 
the memory access protection mechanisms in the MULTICS operating
system. The extension of this technique to properties other than
security  preservation  is  far  from  clear.  Even  in  the  case  of 
security  preservation,  the  verification  of  the  kernel  depends  on  the 
validity  of  all  computer  instructions,  not  just  the  memory  access 
mechanisms.  More  safety  could  perhaps  be  achieved  by  the 
introduction  of  a "software  subverter"  that  tried  periodically  to 
run  through  many  of  the  kernel  routines  permuting  the  values  of  the 
significant  state  variables.  More  evidence  as  to  what  accidents 
might  have  occurred  could  be  obtained  through  the  complete  audit  of 
all  memory  accesses,  cf.  Engelman76. 

Perhaps the most promising attack on this problem is the
facility  incorporated  in  GYPSY,  Ambler76,  for  specifying  dynamic 
consistency  checks  and  incorporating  these  into  the  static  program 
verification. 
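
The "software subverter" suggested above, a process that periodically runs kernel routines over permuted values of the significant state variables, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the three security levels, the read rule used as the specification, and the routine under test are hypothetical and are not taken from any actual kernel.

```python
from itertools import product

# Illustrative security levels (hypothetical, for the sketch only).
LEVELS = {"UNCLASSIFIED": 0, "SECRET": 1, "TOP SECRET": 2}


def specified_read_allowed(subject_level: int, object_level: int) -> bool:
    """Assumed specification: a subject may read only at or below its level."""
    return subject_level >= object_level


def subvert(kernel_read_check) -> list:
    """Exercise a kernel routine over every combination of state values.

    Returns the (subject, object) pairs on which the running routine
    disagrees with its specification. An empty list means that no fault
    was observed on this pass, not that the hardware is correct.
    """
    faults = []
    for subj, obj in product(LEVELS.values(), repeat=2):
        if kernel_read_check(subj, obj) != specified_read_allowed(subj, obj):
            faults.append((subj, obj))
    return faults


def faulty_check(subj: int, obj: int) -> bool:
    """Stand-in for a routine running on failed hardware: one wrong grant."""
    if (subj, obj) == (0, 2):
        return True
    return specified_read_allowed(subj, obj)


print(subvert(specified_read_allowed))  # healthy routine: no faults reported
print(subvert(faulty_check))            # reports the single bad combination
```

Such a check samples behavior at run time. It complements a static verification and does not replace it.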

The  final  hardware  problem,  and  perhaps  the  most  difficult,  is 
that  of  the  hardware  Trojan  horse.  This  is  the  residual  paranoia 
that  remains  after  all  the  software  has  been  mechanically  verified, 
the  verifier  has  been  verified,  the  compiler  has  been  verified,  the 
communication  lines  adequately  encrypted,  the  personnel  subject  to 
rigid  security  clearance  procedures,  etc.  How  then  can  we  be  sure 
that  the  computer  manufacturer  is  not  a spy?  How  can  we  be  sure 
that  some  machine  instruction,  disguised  as,  say,  an  integer 
multiplication, does not, in addition to its purported function,
periodically copy a secret file into a non-classified area?

Perhaps,  the  most  reasonable  answer  to  this  question,  if  one  really 
wants  to  deal  with  it,  is  that  it  should  be  possible  to  devise 
acceptance  tests  that  would  reduce  the  probability  of  such  a 
hardware  Trojan  horse  going  undetected  to  a value  lower  than  our 
reasonable  expectation  that  cleared  personnel  are  engaging  in 
espionage.

The  Cost  Effectiveness  of  Verified  Software 

Finally,  we  would  like  to  suggest  that  as  the  software 
verification  technology  progresses,  we  try  to  distill  some  of  our 
experiences  so  that,  at  the  start  of  each  Air  Force  software 
development  project  we  could  estimate  the  additional  costs  involved 
in  verifying  the  security,  the  integrity,  the  correctness,  etc.  of 
the  system.  Cost  effectiveness  is  much  more  elusive.  Having 
estimated the cost of verification, how do we assign a value to the
verification? We have in mind particularly those cases in which the
requirement  for  verification  derives  from  operational  necessities. 

In  the  case  of  the  automatic  allocation  of  strategic  weapons,  a 
decision  on  which  our  national  survival  may  depend,  what  is  the 
value  of  knowing  a program  is  correct?  In  the  absence  of  a proof, 
must  we  try  to  function  without  the  automatic  capability  or  is  there 
some  level  of  plausibility  at  which  we  employ  an  unverified  program? 
The national survival argument as an absolute is quite persuasive,
but national security arguments do not produce unbounded defense
budgets.


The  primary  area  needing  more  development  is  that  of  an 
integrated  mechanical  verification  technology.  Proofs  cannot  be 
mechanically  generated.  They  must  be  mechanically  verified.  The 
proof  techniques  must  be  integrated  with  the  mechanical  verification 
technology.  The  program  specification  techniques  must  be  integrated 
with  the  proof  techniques.  The  implementation  technology  must  be 
integrated  with  the  specification  techniques,  the  proof  techniques 
and,  preferably,  a verified  compiler.  The  compiler  should  be 
verified  with  respect  to  formal  machine  instruction  set  semantics. 

In addition, we must be mindful that for the verification of
complex programs to become more than a rare occurrence, the
verification  technology  must  support  not  only  the  verification  of  a 
fixed,  correct  program  but  must  equally  support  the  programming 
effort  through  its  normal  debugging  and  maintenance  phases. 

The  other  area  of  profound  importance,  in  our  estimate,  is  that 
of the demands placed on the program verification technology by
distributed  processing.  However,  we  would  suggest  postponing 
development  here  until  more  is  understood  of  the  emerging  hardware 
and  software  structures. 

NEEDS OF VERIFICATION IN BALLISTIC MISSILE DEFENSE


MAJOR  BEN  A.  JOHNSON 
RESEARCH  AND  DEVELOPMENT  COORDINATOR 
DATA  PROCESSING  DIRECTORATE 
BALLISTIC  MISSILE  DEFENSE  ADVANCED  TECHNOLOGY  CENTER 


INTRODUCTION. 


Advances  in  the  state-of-the-art  of  all  technology  disciplines 
surrounding  the  data  processing  subsystem  in  a Ballistic  Missile  Defense 
System  ultimately  depend  to  a large  extent  on  the  data  processing  subsystem 
for invocation and exploitation of those advances. For example, advanced
sensors,  both  electromagnetic  and  optical,  observing  a ballistic  missile 
attack  would  invoke  a phenomenal  real-time  data  base  for  processing  through 
ultra-complex  mathematical/logical  algorithms.  Battle  durations  are 
measured in seconds, responses in microseconds. A 30-second endoatmospheric
engagement that requires, and is implemented on, a 25 MIPS computing system
implies that up to 750 million machine instructions must be executed accurately
and in proper sequence. The complexity, magnitude, and responsiveness
required  of  a data  processing  system  in  such  a role  coupled  with  the  fact 
that  the  only  fully  operational  test  possible  is  during  a short  stint  at 
actual  combat  duty,  imposes  the  emphasis  that  the  BMD  community  must  place 
on  software  testing,  quality  assurance,  and  reliability. 
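
The instruction count quoted above follows directly from the engagement duration and the processor rate. A minimal sketch of that arithmetic is given below; the function name is illustrative and does not come from the paper.

```python
def instruction_budget(duration_s: float, rate_mips: float) -> int:
    """Upper bound on instructions executed during an engagement.

    rate_mips is the processor rate in millions of instructions per second.
    """
    return round(duration_s * rate_mips * 1_000_000)


# Figures quoted in the text: a 30 second engagement on a 25 MIPS system.
budget = instruction_budget(30, 25)
print(f"{budget:,} instructions")
```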

This  paper  only  provides  a brief  overview  of  the  problems  being 
addressed  by  BMDATC  research  programs  and  a discussion  of  the  basic  approaches 
being  pursued.  More  detailed  information  on  the  approaches  is  provided  by  the 
references  shown  in  the  bibliography. 

The  two  major  Verification  and  Validation  (V&V)  Programs  currently 
being  conducted  by  BMDATC  are  Adaptive  Software  Verification  and  Validation 
and  Advanced  Software  Quality  Assurance. 

ADAPTIVE  VERIFICATION  AND  VALIDATION. 


The  major  objectives  of  adaptive  verification  and  validation  are  to: 

a.  Provide  better  information  regarding  the  performance  limits 
of  BMD  software. 

b.  Significantly  reduce  the  cost  of  testing  BMD  software  to  a 
desired  level  of  confidence. 

The former is achieved by determination of the actual limits of
BMD  processor  performance  as  opposed  to  establishing  adequate  performance 
within  some  predefined  envelope  of  threats.  The  latter  objective  is  to  be 
gained  primarily  through  the  introduction  of  sophisticated  automation  into 
the  testing  process  in  order  to  reduce  tedious  and  time  consuming  manual 
activities. 

Previous  research  in  BMD  software  testing  has  concentrated  on  the 
problem  of  furnishing  a consistent  environment  to  all  levels  of  the 
developing  software  and  in  providing  a sufficiently  realistic  representation 
to  the  final  software  capable  of  at  least  limited  real-time  testing.  The 
problems  associated  with  the  high  costs  of  testing  have  been  attacked  through 
the  earlier  detection  of  errors  when  corrections  are  less  complex  and  less 
likely to involve extensive retesting. It is still the case that performing
a single test of an analytic simulation of the software or of the actual
real-time software is an expensive process. At this level of detail, the
specification of the threat parameters as well as the parameters which give a
sufficiently high-fidelity representation of the remaining BMD subsystems is a
time-consuming, tedious process taking up to two weeks of man-time. Even
more  time-consuming  is  the  analysis  of  the  test  results  to  see  if  performance 
was satisfactory. While performance in system terms (e.g., the total number
of penetrators) is found relatively easily, evaluation of the performance of
the  various  subfunctions  of  the  data  processing  subsystem  involves  the 
examination  of  hundreds  of  parameters  in  combination  as  a function  of  time 
during  the  engagement.  Experience  with  systems  such  as  SAFEGUARD  showed 
that  many  man-months  of  effort  could  be  involved  in  analyzing  a complex  test. 

Over  and  above  the  problems  associated  with  defining  a specific  test 
scenario  and  analyzing  the  output  is  the  problem  of  selecting  the  next  test 
to  run,  particularly  if  it  is  to  accomplish  a very  specific  purpose  such  as 
placing  additional  load  on  a specific  software  function.  In  some  cases  the 
appropriate  change  is  obvious,  possibly  just  involving  an  increase  in  attack 
size.  In  other  cases  the  action  of  the  adaptive  portions  of  the  software 
itself,  such  as  resource  allocation,  tend  to  obscure  specific  cause  and  effect 
vis-a-vis  the  threat  scenario.  Under  these  circumstances,  a process  of  trial 
and  error  is  often  needed  to  achieve  the  desired  effect. 

When these considerations are placed in conjunction with the relatively
large number of tests generally required to build confidence in the software,
it is not surprising that the cost of testing is so high. Estimates for the
broad spectrum of military software tend to put testing costs at around forty
percent of the total cost. For BMD systems, with their complexity and
requirements for high confidence at deployment, the fraction is probably higher.

The major functional components of a hypothetical adaptive tester
being  developed  by  BMDATC  are  shown  in  Figure  1.1.  The  role  of  each  unit 
is  as  follows: 


a.  Automated Scenario Generator (ASG): This component accepts
data from the analyst or from the Parameter Perturbation Algorithm and
prepares the working data base containing all the information required for
the development of a complete scenario by the Interactive Testbed. In
interfacing with the analyst, particularly during initialization of an
adaptive testing cycle, an interactive graphics capability is contemplated.

b.  Interactive Testbed (ITB): The Interactive Testbed interfaces
with the data processor or software under test, receiving output from
it and modifying the simulated environment accordingly in order to provide
a realistic data stream as input.

c.  Performance Evaluator: This component accepts data
recorded during testing from both the ITB and object under test. The data
is then reduced according to predefined rules to produce performance measures.

d.  Parameter Perturbation Algorithm (PPA): This function forms the
"brain" of the Adaptive Tester. Reacting to the observed performance of the
test object, the PPA determines the values to be assumed by the parameters
which define a scenario for the next test and passes them to the scenario
generator. The selection criteria are designed to efficiently drive the
test object to the point at which "failure" occurs.
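
The closed loop formed by the four components can be sketched as follows. This is a minimal illustration under stated assumptions, not the BMDATC design: the paper does not give the PPA selection criteria, so a simple bisection on a single hypothetical parameter (attack size) stands in for them, and it assumes that failure is monotone in that parameter. As noted earlier, adaptive software behavior can obscure cause and effect, so that assumption may not hold in practice. All names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    attack_size: int  # hypothetical single scenario parameter


def generate_scenario(attack_size: int) -> Scenario:
    """ASG: build the scenario data from the chosen parameter values."""
    return Scenario(attack_size=attack_size)


def run_testbed(scenario: Scenario, test_object: Callable[[int], int]) -> int:
    """ITB: drive the test object and record its output (here, penetrators)."""
    return test_object(scenario.attack_size)


def evaluate(penetrators: int) -> bool:
    """Performance Evaluator: reduce recorded data to a pass/fail measure."""
    return penetrators == 0


def find_failure_point(test_object: Callable[[int], int], low: int, high: int) -> int:
    """PPA stand-in: bisect on attack size to find the smallest failing value.

    Assumes the test object passes at `low`, fails at `high`, and that
    failure is monotone in attack size.
    """
    while high - low > 1:
        mid = (low + high) // 2
        passed = evaluate(run_testbed(generate_scenario(mid), test_object))
        if passed:
            low = mid
        else:
            high = mid
    return high


def toy_defense(attack_size: int) -> int:
    """Stand-in test object: handles up to 40 objects and leaks the rest."""
    return max(0, attack_size - 40)


print(find_failure_point(toy_defense, low=1, high=1000))  # prints 41
```

About ten test runs locate the failure point in this toy case, against up to a thousand for an exhaustive sweep, which is the kind of saving the adaptive approach is after.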

ADVANCED  SOFTWARE  QUALITY  ASSURANCE. 

The  objectives  of  this  project  are: 

a.  To develop techniques for constructing verifiable BMD
software.

b.  To define tools and techniques for proving that BMD
software is consistent with requirements.

c.  To refine methods for implementing fault-tolerance in BMD
software.

The  current  technique  for  validating  software  performance  is  testing. 
Testing  techniques  have  a limited  capability  for  validating  compliance  with 
requirements and provide an indication of performance boundaries. The extreme
size of BMD software and the number of combinations and permutations in the
input space prevent even approaching a complete test of the software. In
addition, singularities in software performance such as errors caused by
incorrect logical sequences, erroneous variable substitution, or machine
overflow and underflow are often only detected by accident. During the past
five to ten years a large amount of theoretical research has been
conducted  on  techniques  of  assuring  program  correctness,  largely  by 
academic theoreticians. This research has ranged from fully automatic
theorem proving, which proves a program correct, to techniques of symbolic
execution,  which  provide  an  algebraic  equation  that  describes  the  operation 
of  the  code.  This  research  has  been  restricted  to  small  programs  and  has  in 
general  excluded  a number  of  real  world  requirements  such  as  real  numbers, 
real-time  operation  and  concurrency  of  operations. 

The  BMDATC  research  in  advanced  quality  assurance  is  designed  to 
extend  these  powerful  software  verification  techniques  to  BMD  software. 

The  specific  tasks  to  be  performed  under  this  effort  include  development 
and application of an executable assertion language, development of
experimental preprocessors, and implementation of a software quality laboratory.

An  assertion  language  will  be  designed  which  states  intended  input/ 
output  usage  of  program  variables  and  describes  relationships  among  program 
variables  upon  entry  and  exit  from  a module.  Definition  of  a convenient- 
to-use  syntax  which  expresses  these  properties  will  be  the  first  issue 
addressed.  Further  research  issues  include  extending  the  assertion  language 
to enable proof-of-correctness and fault-tolerance techniques to be applied
in BMD software development. The initial definition of the assertion language
will be an appropriate extension to FORTRAN- or PASCAL-based languages.
Enhancements to the assertion language may depend on language features (such as
user-defined data types) available in PASCAL but not in FORTRAN.
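
The idea of executable entry and exit assertions on a module can be illustrated with a minimal sketch. Python is used here only for brevity; the assertion language described above is an extension to FORTRAN or PASCAL, and the decorator name and calling convention below are assumptions made for illustration.

```python
import functools


def asserted(on_entry=None, on_exit=None):
    """Attach executable entry and exit assertions to a module (function).

    on_entry(*args) must hold when the module is entered;
    on_exit(result, *args) must hold when it returns.
    """
    def wrap(func):
        @functools.wraps(func)
        def checked(*args):
            if on_entry is not None and not on_entry(*args):
                raise AssertionError(f"entry assertion failed: {func.__name__}{args}")
            result = func(*args)
            if on_exit is not None and not on_exit(result, *args):
                raise AssertionError(f"exit assertion failed: {func.__name__}{args}")
            return result
        return checked
    return wrap


@asserted(
    on_entry=lambda n: n >= 0,
    on_exit=lambda r, n: r * r <= n < (r + 1) * (r + 1),
)
def integer_sqrt(n):
    """Largest r with r*r <= n, found by simple counting."""
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


print(integer_sqrt(10))  # prints 3; integer_sqrt(-1) would raise AssertionError
```

The same entry and exit predicates that are checked here at run time are the ones a proof-of-correctness tool would take as the module's specification.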

An  experimental  verification  condition  generator  will  be  implemented 
which  accepts,  as  input,  programs  annotated  with  entry/exit  assertions, 
provides  interactive  assistance  in  generating  loop  invariant  assertions, 
and  automatically  maps  fully  asserted  programs  into  verification  conditions 
(logical  formulas  suitable  for  theorem  proving  analysis).  Programs  written 
in a subset of FORTRAN or PASCAL will be processable.
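
The mapping from an asserted program to a verification condition can be sketched for the simplest case, a straight-line sequence of assignments with entry and exit assertions. This is an illustration under stated assumptions and not the experimental generator itself: a real generator emits symbolic logical formulas for a theorem prover and needs loop invariants for loops, whereas the sketch below represents predicates as Python functions over a program state and checks the resulting condition by enumeration over a small domain, which is a test and not a proof. All names are hypothetical.

```python
from itertools import product

# A predicate maps a state (dict of variable values) to bool.
# An expression maps a state to a value.


def wp_assign(var, expr, post):
    """Weakest precondition of `var := expr` with respect to `post`."""
    return lambda s: post({**s, var: expr(s)})


def wp_sequence(assignments, post):
    """Weakest precondition of a straight-line sequence of assignments."""
    for var, expr in reversed(assignments):
        post = wp_assign(var, expr, post)
    return post


def verification_condition(entry, assignments, exit_):
    """VC for {entry} assignments {exit_}: entry implies wp(assignments, exit_)."""
    pre = wp_sequence(assignments, exit_)
    return lambda s: (not entry(s)) or pre(s)


def holds_on(vc, variables, values):
    """Evaluate the VC on every state over a small finite domain."""
    return all(vc(dict(zip(variables, combo)))
               for combo in product(values, repeat=len(variables)))


# Swap x and y through a temporary t; a0 and b0 record the initial values.
swap = [("t", lambda s: s["x"]),
        ("x", lambda s: s["y"]),
        ("y", lambda s: s["t"])]
bad_swap = [("x", lambda s: s["y"]),
            ("y", lambda s: s["x"])]
entry = lambda s: s["x"] == s["a0"] and s["y"] == s["b0"]
exit_ = lambda s: s["x"] == s["b0"] and s["y"] == s["a0"]

names = ["x", "y", "t", "a0", "b0"]
print(holds_on(verification_condition(entry, swap, exit_), names, range(-2, 3)))
print(holds_on(verification_condition(entry, bad_swap, exit_), names, range(-2, 3)))
```

The first check succeeds and the second fails, because the version without a temporary overwrites x before it is copied.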

Preprocessors  will  be  implemented  which  will  translate  assertion 
language  extensions  to  FORTRAN  and  PASCAL  into  statements  acceptable  to 
the  base  language  compiler.  The  preprocessor  approach  was  chosen  in  order 
to  minimize  implementation  effort.  Implementing  the  translation  of  executable 
assertion  language  input,  output,  entry,  exit,  and  assert  statements  into 
executable  FORTRAN  statements  is  the  initial  goal  of  this  element.  A later 
goal  is  to  implement  a similar  translation  into  executable  PASCAL  statements. 
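
The preprocessor approach can be illustrated with a minimal sketch that rewrites assertion statements into ordinary FORTRAN before compilation. The `ASSERT (...)` statement form and the `ASFAIL` reporting subroutine are hypothetical, chosen only for this example; the paper does not give the syntax of the assertion language. Only the assert statement is handled here, not the input, output, entry, and exit statements also named above.

```python
import re

# Hypothetical extension syntax: a statement of the form  ASSERT (<logical expr>)
ASSERT_RE = re.compile(r"^(\s*)ASSERT\s*\((.*)\)\s*$", re.IGNORECASE)


def preprocess(source: str) -> str:
    """Translate ASSERT statements into plain FORTRAN IF statements.

    ASFAIL is an assumed user-supplied subroutine that reports the
    source line number of the violated assertion.
    """
    out = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = ASSERT_RE.match(line)
        if match:
            indent, condition = match.groups()
            out.append(f"{indent}IF (.NOT. ({condition})) CALL ASFAIL({lineno})")
        else:
            out.append(line)  # ordinary statements pass through unchanged
    return "\n".join(out)


program = """\
      SUBROUTINE ROOT(X, Y)
      ASSERT (X .GE. 0.0)
      Y = SQRT(X)
      ASSERT (ABS(Y*Y - X) .LE. 1.0E-4)
      END"""

print(preprocess(program))
```

Because the output is plain base-language source, the existing compiler is used unchanged, which is the implementation saving the preprocessor approach is chosen for.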


The  software  quality  laboratory  will  consist  of  implementations  of 
techniques  for  checking  the  consistency  of  executable  assertion  statements 
within  software  modules.  The  specific  techniques  to  be  implemented  test 
structural  characteristics,  physical  units,  module  interfaces,  and  clean 
termination.  A major  goal  of  this  effort  is  to  quantitatively  assess  the 
enhanced  error  detection  capability  provided  by  these  consistency  analysis 
techniques during a BMD software implementation using assertion statements.

Purpose


• TO  ESTABLISH  A WORKING  UNDERSTANDING  OF  COMPUTER 
PROGRAM  VERIFICATION  AND  VALIDATION 


• TO  DISCUSS  THE  MANAGEMENT  ASPECTS  OF  ACQUIRING  A 
VERIFICATION  AND  VALIDATION  CONTRACTOR 


• TO  DISCUSS  CURRENT  EFFORTS  AIMED  AT  IMPROVING 
PRESENT  PROCEDURES 


The  purpose  of  this  presentation  is  to  describe  the  techniques  of 
computer  program  verification  and  validation  and  explain  how  they 
can  be  successfully  employed  to  improve  the  quality  of  weapons 
system  software.  In  order  to  achieve  this  objective,  we  shall  first 
establish  a working  understanding  of  computer  program  verification 
and  validation  as  it  is  practiced  today.  Then,  we  shall  discuss 
management  practices  and  procedures  for  acquiring  the  services  of 
an  independent  contractor.  Finally,  we  shall  explain  our  current 
work  aimed  at  improving  the  state-of-the-art. 

The Space and Missile Systems Organization (SAMSO) of the U.S.
Air  Force  does  not  develop  its  own  computer  software.  Instead,  it 
depends  on  contractors  to  provide  it  as  part  of  the  weapons  systems 
it  acquires.  This  dependence  has  created  problems  more  related  to 
the  methods  it  uses  for  acquisition  management  than  those  it  uses 
for  software  development.  One  of  the  most  successful  approaches 
applied  by  SAMSO  Program  Offices  to  cope  with  these  acquisition 
management problems is computer program verification and validation.
Verification and validation are techniques used to provide
management  with  systematic  assurance  that  the  computer  programs 
they  acquire  will  perform  their  mission  requirements  economically, 
efficiently,  and  correctly. 

Software acquisitions at SAMSO are developed using a sequential life
cycle  as  defined  in  AFR  800-14,  Volume  2 and  other  applicable 
military  regulations.  The  purpose  of  so  ordering  the  development 
is  to  create  a series  of  validated  baselines  upon  which  the  computer 
software  products  can  be  developed  and  tested.  Typically,  these 
baselines  are  documents  which  specify  either  software  requirements 
or computer programs as actually built. Management of the sequential
life cycle is accomplished by conducting a series of reviews and
audits as described in MIL-STD-1521. The purpose of these
meetings is two-fold. First, they provide management with an
assessment of the project status. Second, they are used to pinpoint
technical  problems  that  require  attention  and/or  action. 

Reviews are preceded by an examination of pertinent documents and/or
code. The actual masses of paper produced to describe the computer
program, its implementation, and the tests that qualify it are
enormous.  More  often  than  not,  sufficient  time  and/or  resources 
are  not  available  to  do  an  adequate  job.  What  results  is  a review 
that  uncovers  inconsistencies.  The  technical  basis  and  decisions 
upon  which  the  design  is  founded  are  frequently  not  challenged.  More 
and  better  feedback  is  required  throughout  the  development  cycle  to 
provide  management  with  assurance  that  their  computer  software 
will  perform  as  specified  and  will  meet  all  of  its  design  goals. 

Computer program verification and validation provide management
with the feedback they require to assess the integrity of their software
products. Computer program verification is the iterative
process  of  continuously  determining  whether  the  product  of  each 
step  of  the  computer  program  acquisition  process  fulfills  all 
requirements  levied  by  the  previous  step,  including  those  set 
for  quality.  Computer  program  validation  is  the  test  and 
evaluation  of  the  complete  computer  program  aimed  at  ensuring 
compliance  with  the  performance  and  design  criteria.  Computer 
program certification is the test and evaluation of the complete computer
program aimed at ensuring operational effectiveness and
suitability  with  respect  to  mission  requirements  under  realistic 
operating  conditions. 


Titan  III  V&V  Philosophy 


• INDEPENDENT  CONTRACTORS 

• INDEPENDENT  TOOLS 

• EXECUTABLE  ANALYSIS 

The Titan IIIC was one of the first projects to use computer program
verification  and  validation.  Titan  is  an  expendable  booster  used  to 
place  satellite  payloads  into  orbit.  This  launch  vehicle  uses  an 
embedded  computer  system  for  guidance  and  issuing  discretes.  The 
flight  program  consists  of  guidance  equations,  digital  flight  control 
equations and interrupts, interface logic, self-test, sumcheck, and
telemetry  formatting  routines.  Contractors  involved  in  producing 
the  flight  program  are  the  Aerospace  Corp.  (guidance  parameters), 
Martin  Company  (reference  trajectory,  pretest  trajectory,  and  range 
safety)  and  Delco  Electronics  (IMU  and  parameter  tapes).  Each  of 
these contractors independently checks the others' work using independent
sets of support tools to ensure proper implementation.

The two efforts that are used to flight certify the Titan IIIC computer
programs  are  verification  and  validation.  Validation  is  a gross  test 
to  demonstrate  that  the  equations  as  coded  on  the  flight  tapes  meet 
all  specifications;  whereas,  verification  is  a detailed  test  to  assure 
that  the  tapes  are  coded  in  accordance  with  the  equations  document. 
Although the goals of validation and verification are the same, the
role  each  plays  is  vastly  different.  Validation  consists  of  overall 
performance  testing  to  demonstrate  the  adequacy  of  the  equations  as 
coded  under  simulated  flight  conditions.  On  the  other  hand,  the  intent 
of  verification  is  not  to  question  validity,  but  rather  to  check  in  detail 
if  the  flight  tapes  are  coded  and  punched  in  conformance  with  the 
equations  document. 

Titan III Verification Analysis and Activities


• SCALING  AND  DIMENSION  ANALYSIS 

• PATH  USAGE  ANALYSIS 

• OVERFLOW/UNDERFLOW EXECUTION

• ACCURACY  ANALYSIS 

• TIMING  ANALYSIS 

• CODING  RULE  ANALYSIS 


The  purpose  of  verification  is  to  ensure  that  the  computer  program 
completely  and  correctly  meets  every  requirement  of  the  Program 
Requirements  Document.  Verification  is  accomplished  by  a process 
of  detailed  tests  which  demonstrate  on  a step  by  step  basis  that  the 
results  of  executing  every  equation,  logic  branch,  and  input/output 
statement  faithfully  satisfy  the  requirements  of  the  specification. 

The  Titan  IIIC  computer  programs  are  tested  for: 


1.  Correctness:  performance  of  the  algorithm  and/or  logic  as 
specified. 

2.  Continuity:  execution  of  specified  transfers  and  selection  of 
proper  branch  destinations. 

3.  Overflow:  generation  of  an  overflow  or  improper  divide  for 
a correlated  set  of  data. 

4.  Accuracy:  generation  of  a result  that  is  accurate  to  within 
the  allowable  least  significant  bit. 

Titan III Verification Tools


• OPEN-LOOP  AUTOMATIC  ANALYZER  (Martin) 

. EXECUTES  CODE  IN  FLIGHT  CONFIGURATION  COMPUTER 
. COMPARES  RESULTS  TO  SCIENTIFIC  SIMULATION  RESULTS 

• CODE  COMPARATOR  (Aerospace) 

. EXECUTES CODE IN INTERPRETIVE COMPUTER SIMULATOR
. COMPARES  RESULTS  TO  SCIENTIFIC  SIMULATION  RESULTS 


Titan IIIC verification tools include the Martin Company-developed,
open-loop, automatic analyzer and the Aerospace Corporation-developed
code comparator. The difference between the two tools is
that the analyzer uses the actual flight computer, while the comparator
simulates its execution characteristics using an interpretive
computer simulator. In either case, the flight code is executed and
compared, statement-by-statement, with an independently coded
FORTRAN version of the program. All miscomparisons are automatically
flagged. To pass the success criteria, arithmetic differences
between the two versions must be within specified tolerance
levels. Integers, logic branches, and input/output are also checked
and must be absolutely as specified.
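
The comparison rule described above, a tolerance on arithmetic results and exact agreement on everything else, can be sketched as follows. This is an illustration only and not the Martin or Aerospace tool: the trace format, the statement labels, and the tolerance value are assumptions.

```python
import math


def compare_traces(flight, reference, tolerance):
    """Compare two statement-by-statement traces and flag miscomparisons.

    Each trace is a list of (statement_label, value) pairs. Floating-point
    values must agree within `tolerance`; every other value (integers,
    branch destinations, input/output words) must match exactly.
    """
    flagged = []
    if len(flight) != len(reference):
        flagged.append(("<trace length>", len(flight), len(reference)))
    for (label, got), (ref_label, want) in zip(flight, reference):
        if label != ref_label:
            flagged.append((label, got, want))
        elif isinstance(got, float) and isinstance(want, float):
            if not math.isclose(got, want, rel_tol=0.0, abs_tol=tolerance):
                flagged.append((label, got, want))
        elif got != want:
            flagged.append((label, got, want))
    return flagged


# Hypothetical traces: flight code results versus the independent version.
flight = [("S10", 0.50001), ("S20", 7), ("S30", "BRANCH L5"), ("S40", 1.25)]
reference = [("S10", 0.5), ("S20", 7), ("S30", "BRANCH L6"), ("S40", 1.5)]

for miscompare in compare_traces(flight, reference, tolerance=1e-3):
    print("MISCOMPARE", miscompare)
```

In this example S10 passes because the difference is inside the tolerance, while the wrong branch destination at S30 and the out-of-tolerance value at S40 are flagged.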

Titan III Validation Techniques


• ICS  VALIDATION 

. RUN  CASES  WITH  NOMINAL  AND  DISPERSED  DATA 
. COMPARISON  OF  SCIENTIFIC  vs  ICS  RUNS 
. COMPARISON  OF  RUN  BETWEEN  CONTRACTORS 
. IMAGE  RESTARTS  TO  FORCE  BACKUP  PATHS 

• HARDWARE  VALIDATION 

. MARRY  FLIGHT  CONFIGURATION  HARDWARE  MOCKUP 
TO  ANALOG  SIMULATION 

. COMPARE  TELEMETRY  DATA  TO  SPECIFICATION  AND 
ICS  RUNS 


Titan  IIIC  validation  is  conducted  to  demonstrate  that  the  equations, 
as  coded,  properly  perform  and  that  the  mission  specifications  are 
met under simulated flight conditions. Aerospace makes an interpretative
computer simulation (ICS) run using parameter tapes for
the initial part of the boost trajectory. By comparing the resultant
output data from the ICS trajectory run against equivalent counterparts
in a scientific trajectory run, Aerospace is able to test on a
system basis whether the parameter tapes perform as expected.
Delco  and  Martin  also  validate  by  making  independent,  six- 
dimensional, scientific  trajectory  runs  to  test  the  suitability  of  the 
guidance  parameters  in  meeting  specifications.  Other  techniques 
used  for  Titan  mate  hardware  to  analog  and  telemetry  devices. 
These  are  used  to  provide  additional  data  against  which  the  tapes 
can  be  validated. 
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Another  approach  to  computer  program  verification  and  validation  is 
practiced  at  the  Space  and  Missile  Test  Center  (SAMTEC).  SAMTEC 
operates  the  Western  Test  Range,  a widely  dispersed  network  of 
radar  and  optical  sensors,  communication  links,  and  data  processing 
facilities.  Its  basic  mission  is  to  provide  support  to  both  ballistic 
missile  and  satellite  programs.  Facilities  along  the  California 
coast  provide  optical  and  radar  tracking  of  the  missile  or  space 
booster,  transmit  commands,  and  receive  telemetry  during  the 
launch  phase  of  flight.  Facilities  in  the  Hawaiian  Islands  support 
missile  flights  in  mid-course,  while  fixed,  airborne,  and  shipborne 
facilities  provide  the  instrumentation  in  the  terminal  areas. 
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SAMTEC 

V&V  POLICY 


• ALL  CLASS  1 SOFTWARE  WILL  BE  V&V'ED  PRIOR  TO 
OPERATIONAL  USE 


• CLASS  1 SOFTWARE  IS  REAL-TIME  MISSILE  FLIGHT 
SAFETY  APPLICATIONS  SOFTWARE 


SAMTEC  has  established  the  policy  that  all  Class  1 software  will  be 
verified  and  validated  by  an  independent  contractor  prior  to  being 
placed  in  operational  use.  Class  1 software  encompasses  computer 
programs  written  for  real-time  missile  flight  safety  applications. 
The  potential  dangers  to  life  and  property  warrant  the  additional 
degree  of  protection  afforded  by  this  approach.  However,  it  is  also 
necessary  not  to  incorrectly  abort  a good  missile.  Therefore, 
software  systems  at  SAMTEC  must  support  missile  safety  officers 
in  achieving  both  these  goals. 
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Areas  of  Activity  for  V&V  Contractor 


• NEW  SOFTWARE 

. SPECIFICATION  REVIEW 
. ANALYSIS  AND  TESTING 

• MODIFIED  OPERATIONAL  SOFTWARE 

. SUBSTANTIVE  MODS 
. NONSUBSTANTIVE  MODS 

• GENERATION  OF  V&V  TOOLS 

. PROPOSAL  AND  SCREENING 
. PREPARATION 

• SPECIAL  STUDIES 

. QUICK  REACTION 
. CRITIQUES  AND  TRAINING 


SAMTEC  has  acquired  the  services  of  an  independent  verification  and 
validation  contractor  to  help  them  achieve  these  goals.  Contrac- 
tually, the  services  provided  by  this  contractor  include  participating 
in  reviews,  studies,  and  analyses  and  designing  and  conducting  tests 
aimed  at  ensuring  that  critical  requirements  have  been  met.  The 
results  of  using  this  approach  have  been  encouraging.  As  an 
example,  a 25,000  word  program  that  was  an  integral  part  of  the 
range  safety  system  at  Vandenberg  AFB  for  the  last  eight  years  was 
evaluated.  Twenty  major  errors  were  detected,  seven  of  which  were 
critical  to  range  operations.  In  another  case,  a new  range  safety 
system  was  evaluated.  It  was  determined  to  be  unready  for  opera- 
tional use  and  subsequently  had  to  undergo  considerable  modification. 
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Based  on  our  Titan  IIIC,  SAMTEC,  and  other  project  experiences,  it 
seems  that  there  is  much  an  independent  verification  and  validation 
contractor  can  do.  The  verification  and  validation  contractor  should 
support  document  reviews  and  special  studies.  He  should  always  be 
tasked  to  identify  critical  requirements  and  formulate  a verification 
and  validation  plan.  In  this  way,  his  expertise  can  be  used  to  pro- 
vide the  detailed  technical  feedback  we  need  to  judge  the  soundness 
of  the  design  and  the  integrity  of  the  resulting  products.  As  fund- 
ing permits,  the  independent  contractor  can  analyze  key  algorithms 
and  provide  recommended  improvements  or  alternatives.  He  can 
prepare  test  tools  and  be  tasked  to  independently  test  and  evaluate 
the  code  to  ensure  proper  compliance.  He  can  even  be  tasked  to 
prepare  specifications. 
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Economic  Considerations  of  Software  Testing 
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SOFTWARE  CERTIFICATION / FAULT  TOLERANCE 


EXTENSIVE  TESTING/ INDEPENDENT  V&V 


FORMAL  TEST  PROGRAM 


SOFTWARE  TESTING  BY  PROGRAMMER 


SOFTWARE  TESTING 


Software  testing  often  accounts  for  40  to  50  percent  of  the  computer 
program  development  budget.  In  spite  of  this  large  investment,  it  is 
generally  agreed  that  the  results  of  testing  are  uncertain.  There- 
fore, the  Air  Force  uses  other  approaches  to  increase  its  confidence 
in  its  final  software  products.  At  SAMSO,  our  reliability  require- 
ments are  so  stringent  that  we  have  to  apply  such  test  techniques  as 
independent  verification  and  validation  and  fault-tolerance.  These 
approaches  cause  us  to  expend  additional  funds  during  development. 
These  funds  can  be  significant.  Yet,  they  are  warranted  because  we 
have  only  a limited  opportunity  to  recover  from  a failure  once  the 
satellite  or  missile  is  launched. 
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Relative  Costs  of  Development  and  Independent  Testing 


NAME   SIZE     DEVELOPMENT   INDEPENDENT      PERCENT
                COST (mm)     TEST COST (mm)

A      LARGE    750           252              33
B      LARGE    750           300              40
C      MEDIUM   337           142              42
D      MEDIUM   310           150              48
E      MEDIUM   112            38              33
F      SMALL     12             6              50
G      SMALL     12             6              50
H      MOD      125            50              40
I      MOD      650           450              69
J      MOD      300           112              37

R. H. Thayer and E. S. Hinton, "Software Reliability - A Method That Works,"
National Computer Conference, 1975


The  concept  of  test  and  evaluation  by  an  independent  contractor 
implies  an  additional  cost  of  some  magnitude.  Based  on  data  from 
ten  different  projects,  these  costs  range  from  33  to  69  percent  of 
that  expended  on  software  development.  The  higher  costs  imply  a 
considerable  analysis  effort,  a heavy  use  of  specially  constructed 
software  tools,  and  a comprehensive  test  and  execution  analysis  of 
the  completed  code.  The  lower  costs  are  associated  with  a 
monitoring  activity  which  has  limited  test  and  execution  analysis 
coupled  to  it. 
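The percentage column can be recomputed from the two cost columns as a check on the table. The sketch below is illustrative only: the `PROJECTS` list simply transcribes the ten rows, and truncating to a whole percent is an assumption that happens to reproduce the printed figures (rounding to the nearest percent would give 34 for projects A and E).

```python
# (name, size class, development cost, independent test cost); costs are in
# the table's own "mm" units.
PROJECTS = [
    ("A", "LARGE", 750, 252),
    ("B", "LARGE", 750, 300),
    ("C", "MEDIUM", 337, 142),
    ("D", "MEDIUM", 310, 150),
    ("E", "MEDIUM", 112, 38),
    ("F", "SMALL", 12, 6),
    ("G", "SMALL", 12, 6),
    ("H", "MOD", 125, 50),
    ("I", "MOD", 650, 450),
    ("J", "MOD", 300, 112),
]


def percent_of_development(dev_cost: int, test_cost: int) -> int:
    """Independent test cost as a whole percent of development cost (truncated)."""
    return test_cost * 100 // dev_cost


percents = {name: percent_of_development(dev, test) for name, _, dev, test in PROJECTS}
lowest, highest = min(percents.values()), max(percents.values())
print(percents)
print(f"range: {lowest} to {highest} percent")
```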
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Lessons  Learned 


• V&V  MUST  BE  TAILORED  TO  EACH  PROJECT  USING  IT 

• KEY  V&V  TO  CRITICAL  AREAS  AND  SIGNIFICANT  DISCREPANCIES 

• V&V  HAS  BEEN  SUCCESSFUL  - BUT  ONCE  INITIATED,  MUST  CONTINUE 

• MAINTAIN  CLEAR  ACCOUNTABILITY  FOR  RESULTS 

• USE  EXPERIENCED  PEOPLE 

• INTERFACE  PROBLEMS  ARE  THE  MAJOR  DIFFICULTY 

• COSTS  CAN  BE  REDUCED  USING  AUTOMATED  TOOLS 




Our  experiences  in  applying  computer  program  verification  and 
validation  have  taught  us  some  useful  lessons.  First,  concentrate 
on  the  critical  requirements  and  the  significant  discrepancies 
because  these  are  the  ones  that  it  is  cost  effective  to  prove  or 
remove.  Second,  understand  that  whenever  there  are  multiple 
organizations  working  on  a project  there  are  going  to  be  interface 
problems.  Direct  your  attention  to  them  early  and  try  to  solve  them 
by  making  contractors  accountable  and  by  using  experienced  people. 
Third,  realize  that  costs  can  be  reduced  using  automated  tools. 
Make  your  contractors  justify  the  economics  of  their  use,  though, 
before  you  procure  them.  Fourth,  remember  that  verification  and 
validation  must  be  continued  once  initiated  to  ensure  success. 
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Management  Control 


• CONTRACT 

• TASK  APPROVAL 

• WRITTEN  REPORTS 

• MONTHLY  PROJECT  REVIEWS 

• INTERFACE  CONTROL 

. ALL  ONE  TEAM 

. COMMUNICATIONS  CHANNELS 
OPEN 

. EARLY  INVOLVEMENT 

• TOOLS  DELIVERABLE 


We  have  also  learned  to  manage  our  verification  and  validation 
contracts  better.  We  suggest  that  a fixed  price,  level  of  effort  con- 
tract with  task  approval  be  considered  during  procurement.  This 
type  of  contract  enables  the  project  officer  to  task  the  contractor  to 
perform  special  studies  and  analyses  as  required.  We  suggest  that 
if  tools  are  developed  as  part  of  the  contract,  they  be  screened 
before  their  development  is  approved  to  see  if  they  can  be  made 
generally  applicable.  If  they  can,  we  suggest  they  be  delivered  with 
appropriate  documentation.  Written  reports  for  each  task  and 
informal,  monthly  management  reviews  are  recommended.  We  also 
recommend  that  you  endeavor  to  instill  a team  spirit  and  attack  the 
interface  problems  early  in  the  project. 
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SAMSO  has  initiated  an  advanced  development  program  to  create  the 
Space  Data  Systems  Facility  (SDSF).  The  SDSF  is  a ground-based 
computational  laboratory  which  will  be  used  to  develop  and  verify 
and  validate  flight  programs  (both  computer  programs  and  micro- 
programs) for  a variety  of  space  and  missile  applications.  The 
project  initiated  this  year  will  take  five  years  to  fully  develop.  An 
interim  SDSF  capability  will  be  available  in  two  years.  The  facility 
will  be  housed  at  SAMSO  where  it  may  be  made  available  to  both 
SPO  and  contractor  users.  The  facility  will  have  sophisticated 
emulation,  simulation  (both  hybrid  and  digital),  and  diagnostic  capa- 
bilities. A variety  of  standard  software  tools  will  be  available  as 
required. 
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The  purpose  of  this  presentation  was  to  establish  a working 
understanding  of  computer  program  verification  and  validation  and 
to  explain  how  SAMSO  employs  them  to  improve  the  quality  of 
weapons  system  software.  Verification  and  validation  should  be 
employed  throughout  the  life  cycle  to  provide  technical  feedback  as 
to  product  status  and  quality.  When  properly  applied,  the  techniques 
help  uncover  problems  early  so  that  they  can  be  corrected.  The 
cost  of  applying  verification  and  validation  to  large  projects  could 
become  very  high  if  our  launch  vehicle  experience  serves  as  a valid 
model.  To  use  these  concepts,  different  strategies  for  their  appli- 
cation need  to  be  formulated  based  on  available  funding  and  assess- 
ments of  technical  risk  and  complexity. 
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Consistency  Checker,  Summary 
Michael  Landes 
Rome  Air  Development  Center 

Mr.  Michael  Landes  RADC/ISIM  gave  a presentation  on  the  ongoing 
development  of  the  Consistency  Checker  and  the  Computer  Software 
Specification. 

The  Consistency  Checker,  an  automated  verification  software  tool 
which  collects  a file  of  pertinent  data,  imposes  a methodology  on  the  design 
phase  personnel.  This  data  file  is  designed  to  be  compatible  with  CARA  system 
output  and  is  used  to  verify  the  design  phase  against  the  CARA  output  of  the 
requirements  phase. 

The  Computer  Software  Specification  is  a collection  of  sections 
properly  worded  for  insertion  in  a software  Statement  of  Work  to  ensure 
that  elements  of  modern  programming  practices  are  mandated.  This 
evolutionary  vehicle  is  now  under  revision  to  form  a joint  RADC/ESD 
Computer  Software  Specification. 
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AMPIC:  SHORT  DESCRIPTION 




AMPIC  PHASES 
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AMPIC  FORTRAN  EXAMPLE:  CODE  PARTITIONING  (PHASE  I) 
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AMPIC  FORTRAN  EXAMPLE:  STRUCTURED  REPRESENTATION  (PHASE  II) 


AMPIC  FORTRAN  EXAMPLE:  FLOWCHART  WITH  TRANSLATIONS 
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CONVENTIONAL  FLOWCHART  FORMAT 


Next,  AMPIC  will  use  the  previously  derived  information  and  provide  input- 


AMPIC  FORTRAN  EXAMPLE : PATHWISE  RESULTS  (PHASE  IV) 
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AMPIC  COMPILED  CODE  EXAMPLE:  CODE  PARTITIONING  (PHASE  I) 
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AMPIC  COMPILED  CODE  EXAMPLE:  STRUCTURED  REPRESENTATION  (PHASE  II) 




AMPIC  COMPILED  CODE  EXAMPLE:  TRANSLATION  (PHASE  III) 
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AMPIC  COMPILED  CODE  EXAMPLE:  PATHWISE  STATEMENT  (PHASE  IV) 
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Verification  of  Requirements  and  Design 
Through 

Structured  Analysis  Techniques 
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Software  verification  and  validation  activities  have  as  a 
primary  objective  the  discovery  and  correction  of  errors  in  the  computer 
programs  under  examination.  Many  of  these  errors  are  introduced  by  the 
programmers  during  the  programming  activity  and  go  undetected  during 
the  subsequent  testing  activities.  However,  a significant  number  of 
errors  are  introduced  earlier  in  the  software  development  process  even 
before  the  programming  effort.  An  examination  of  a dozen  different 
aerospace  verification  and  validation  activities  revealed  that,  out  of  a 
total  of  649  significant  errors  that  were  discovered,  246  errors  had 
their  origins  in  the  software  specifications.  Another  study  indicated 
that  in  command  and  control  software,  as  many  as  two-thirds  of  the 
errors  discovered  during  and  after  acceptance  testing  had  their  origins 
in the software specifications.[3] If one examines the literature one
discovers that many new techniques and tools have been produced that,
through improvements in the programming and testing activities, increase
software reliability.[2,3,4,5] There has not been an equivalent effort
devoted  to  improving  the  techniques  for  the  detection  or  elimination  of 
errors  due  to  inadequacies  in  the  software  specifications.  Does  this 
mean  that  our  existing  techniques  for  the  review  of  software  specifica- 
tions are  adequate?  In  the  examination  of  the  dozen  aerospace  verifi- 
cation and  validation  activities  cited  above,  145  of  the  246  errors 
in  the  software  specifications  were  detected  by  a thorough  review  of 
those  specifications;  101  errors  went  undetected  by  that  review,  were 
passed  on  to  the  coding  and  test  activity,  and  were  only  detected  during 
the  validation  of  the  resultant  code.  The  failure  to  detect  errors  in 
the  software  specifications  early  in  the  development  process  can  lead  to 
increased  software  development  cost.  It  can  cost  10  to  60  times  more  to 
correct  an  error  found  during  testing  than  it  would  have  cost  if  that 
error  had  been  detected  before  programming  was  begun.  Another  factor  is 
that  the  organization  of  software  testing  often  means  that  the  most  signifi- 
cant errors  are  detected  late  in  the  testing  activity.  The  initial  unit 
testing  usually  only  detects  the  errors  made  by  the  programmers  when  the 
unit  under  test  was  programmed.  The  integration  testing  usually  only 
detects  errors  that  result  from  an  incomplete  or  ambiguous  software  design. 
Acceptance  testing,  done  last,  detects  errors  that  were  made  very  early, 
when  the  functions  that  the  program  was  to  do  were  defined.  Clearly  the 
early  verification  of  the  original  functional  requirements  is  essential. 
Equally  clear  is  the  fact  that  the  repertoire  of  techniques  that  has 
been  used  in  the  past  to  verify  software  specifications  and  functional  re- 
quirements needs  to  be  enhanced. 
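The figures quoted above can be put side by side with a little arithmetic. The sketch below uses only the counts given in the text. The cost comparison at the end is a hypothetical illustration: it assumes one cost unit to correct a specification error before programming begins and applies the quoted factor of 10 to 60 to the errors that escaped the specification review.

```python
# Counts quoted in the text for a dozen aerospace V&V activities.
TOTAL_ERRORS = 649       # significant errors discovered
SPEC_ERRORS = 246        # errors originating in the software specifications
CAUGHT_BY_REVIEW = 145   # specification errors found by specification review
ESCAPED_REVIEW = 101     # specification errors found only when validating code


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# The two review outcomes account for every specification error.
assert CAUGHT_BY_REVIEW + ESCAPED_REVIEW == SPEC_ERRORS

spec_share = share(SPEC_ERRORS, TOTAL_ERRORS)         # about 37.9 percent
review_yield = share(CAUGHT_BY_REVIEW, SPEC_ERRORS)   # about 58.9 percent
escape_rate = share(ESCAPED_REVIEW, SPEC_ERRORS)      # about 41.1 percent

# Hypothetical cost model: 1 unit per error if corrected before programming,
# 10 to 60 units per error if corrected during testing.
early_cost = ESCAPED_REVIEW * 1
late_cost_low = ESCAPED_REVIEW * 10
late_cost_high = ESCAPED_REVIEW * 60

print(f"specification errors: {spec_share:.1f}% of all errors")
print(f"caught by review: {review_yield:.1f}%, escaped: {escape_rate:.1f}%")
print(f"cost of the escapes: {late_cost_low} to {late_cost_high} units vs {early_cost}")
```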


One such enhancement is the use of Structured Analysis for verification.
Structured Analysis was developed and is currently used to develop
functional requirements ab initio.[1,2] The use of Structured Analysis will
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improve  software  reliability  because  of  the  increased  completeness  and 
correctness  of  the  functional  requirements  just  as  the  use  of  structured 
programming  improves  software  quality  by  reducing  the  complexity  of  the 
resultant  program.  However,  there  is  the  need  to  verify  the  completeness 
and  correctness  of  software  functional  requirements  that  were  not  developed 
through  Structured  Analysis  methods.  Structured  Analysis  can  contribute 
to  satisfying  this  need.  Before  describing  how  this  can  be  accomplished 
a short  discussion  of  the  basic  concepts  of  Structured  Analysis  is  in  order. 

Structured  Analysis  is  a method  for  modeling  complex  systems  in  a 
top-down  hierarchic  form.  The  model  that  results  from  the  application  of 
Structured  Analysis  consists  of  an  ordered  sequence  of  "blueprint"  dia- 
grams. Each  diagram  makes  a complete  statement  to  a clearly  understand- 
able level  of  detail  about  a specific  topic.  Further  detail  about  particu- 
lar functions  is  presented  on  lower  level  diagrams.  Each  lower  level, 
more  detailed,  diagram  connects  exactly  into  the  higher  level,  more  general 
diagram  in  a manner  that  preserves  the  logical  relationship  of  each  com- 
ponent to  the  total  system. 

A complete  model  consists  of  two  types  of  decomposition: 

1)  An  activity  model  that  indicates  the  interrelation- 
ships between  the  activities  the  system  is  to  per- 
form in  terms  of  the  input  data,  output  data,  control 
data  and  mechanisms  that  constrain  each  activity. 

2)  A data model that indicates the interrelationships
between data in terms of the generating, using and
controlling activities and the storage mechanisms
that constrain the data.

The completeness and consistency of the two models are demonstrated by an
activity-data tie which links each element of one model with its
equivalent in the other model.

The  graphical  language  used  to  construct  Structured  Analysis  models 
is  very  simple  consisting  of  boxes  to  represent  activities  (in  the 
activity  model)  or  data  (in  the  data  model)  and  arrows  to  represent  the 
constraints.  A diagram,  at  any  level,  can  have  no  fewer  than  three  and 
no  more  than  six  boxes.  This  prevents  any  diagram  from  becoming  overly 
complex  and  forces  decomposition  to  show  details.  Input  arrows  are  al- 
ways drawn  entering  the  left  side  of  a box,  controls  entering  the  upper 
side,  outputs  leaving  the  right  side  and  mechanisms  entering  the  bottom 
side.  A referencing  scheme  is  used  in  Structured  Analysis  to  assign 
unique  and  meaningful  numbers  to  each  activity  or  data  box  to  enable 
rapid  cross-referencing  between  levels.  Finally  a key  element  of 
Structured  Analysis  is  a disciplined  approach  to  the  process  by  which  the 
diagrams  are  produced  and  reviewed. 
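The diagram rules just described are simple enough to check mechanically. The following is a minimal sketch, assuming a hypothetical in-memory representation (`Box`, `Diagram`, and the node numbers in the usage example are illustrative names, not part of any Structured Analysis tool): each box carries arrows in the four roles, each role is tied to one side of the box, and a diagram is rejected if it has fewer than three or more than six boxes.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Each arrow role attaches to a fixed side of a box.
ARROW_SIDES = {"input": "left", "control": "top", "output": "right", "mechanism": "bottom"}


@dataclass
class Box:
    node: str  # unique reference number for cross-referencing between levels
    name: str
    arrows: Dict[str, List[str]] = field(
        default_factory=lambda: {role: [] for role in ARROW_SIDES}
    )


@dataclass
class Diagram:
    node: str
    title: str
    boxes: List[Box]

    def problems(self) -> List[str]:
        """Return rule violations; an empty list means the diagram is well formed."""
        found = []
        if not 3 <= len(self.boxes) <= 6:
            found.append(f"{self.node}: {len(self.boxes)} boxes (must be 3 to 6)")
        seen = set()
        for box in self.boxes:
            if box.node in seen:
                found.append(f"{self.node}: duplicate node number {box.node}")
            seen.add(box.node)
            for role in box.arrows:
                if role not in ARROW_SIDES:
                    found.append(f"{box.node}: unknown arrow role {role!r}")
        return found
```

A diagram built as `Diagram("A0", "Top", [Box("A1", "Acquire"), Box("A2", "Process")])` would report one problem, since two boxes are too few; adding a third box clears it.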
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In  developing  a functional  requirements  model  from  the 
beginning  using  Structured  Analysis,  the  responsible  analysts  would 
attempt  to  capture  the  experiences  of  experts  and  users,  the  intent  of 
the  needs  analysis  documentation,  and  the  influences  of  the  environment  in 
which  the  software  system  must  be  developed  and  operate.  In  verifying 
an  existing  functional  requirements  document,  which  has  not  been  developed 
using  Structured  Analysis  techniques,  the  responsible  analysts  attempt  to 
capture  the  content  of  that  existing  functional  requirement  document  in  a 
Structured  Analysis  model.  Errors  in  the  functional  requirements  document 
are  revealed  in  two  ways.  First,  attempting  to  restate  the  requirements 
in  the  more  rigid  and  structured  format  will  reveal  inconsistencies, 
missing  or  incomplete  interfaces,  and  statements  that  are  either  too 
general  or  too  specific.  Although  simple  in  appearance,  the  Structured 
Analysis  language  can  be  a tough  task-master;  poorly-conceived  and  poorly- 
expressed  elements  of  the  software  requirements  documents  are  recognized 
as  such  when  one  attempts  to  construct  the  model.  Secondly,  the  Structured 
Analysis  model  is  more  accessible  for  review  and  comment  than  the  original 
requirements documentation because of the model's hierarchical form and explicit
description of interfaces. The simplicity of Structured
Analysis is a great advantage in this respect. Individuals with little or
no  software  knowledge  can  be  taught  how  to  "read"  the  diagrams  of  the  model 
in  less  than  two  hours.  This  enables  the  intended  users  of  the  software, 
who  might  be  deterred  from  reading  the  original  requirements  documents 
because  of  their  bulk  and  unfamiliar  language,  to  participate  in  an  effective 
review  of  the  functional  requirements. 

One of the first things that is observed when one attempts to
construct a Structured Analysis requirements model from the typical
requirements documentation is that the original documentation is a mixture of
incompatible levels of detail. Some requirements will be stated very
generally  without  there  being  any  further  decomposition  or  detailing.  Other 
requirements  will  be  stated  in  a very  specific  and  detailed  manner  but  the 
general  requirement  that  provides  the  motivation  for  these  detailed  require- 
ments is  not  provided.  If  one  were  to  imagine  the  requirements  as  being 
organized  as  a tree  structure,  proceeding  from  the  general  to  the  specific, 
one  would  observe  that  many  requirements  documents  have  many  fine  and 
detailed  branches  unattached  to  the  main  limbs  and  many  large  limbs  that 
possess  no  smaller,  more  detailed  branches.  The  missing  portions  of  a 
functional  requirements  document  will  become  apparent  as  the  model  is  con- 
structed. 

The use of Structured Analysis is of benefit in initially
defining a requirements/design/code traceability matrix. The normal
requirements documents are relatively unstructured from a requirements viewpoint.
The same requirement may be repeated with different emphasis and from a
different viewpoint at several places in the documents. Software
requirements may not be clearly differentiated from hardware requirements. Detailed,
specific requirements are interspersed with general, overall requirements.
The construction of a requirements model will partition general and specific
requirements, will show the relationship of software requirements to
hardware requirements, and will point out duplications or ambiguities. Each
requirement of the model will have a unique "node" number that is correlated
both to higher and lower requirements in the hierarchy and to the original
requirements documents. This node number can then serve as an identifier
in matching the design or code, which themselves will be more naturally
structured than the original requirements, to the requirements.
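One way to picture such a traceability matrix is as a table keyed by node number. The sketch below is an assumption-laden illustration rather than a description of any existing tool: the class name, the node-number format (a child appends one digit to its parent's number), and the document references in the usage note are all hypothetical.

```python
from collections import defaultdict
from typing import Optional


class TraceabilityMatrix:
    """Links each requirement node to its source text, design items, and code items."""

    def __init__(self):
        self.requirements = {}  # node number -> reference into the original documents
        self.links = defaultdict(lambda: {"design": set(), "code": set()})

    def add_requirement(self, node: str, source_ref: str) -> None:
        self.requirements[node] = source_ref

    def link(self, node: str, kind: str, item: str) -> None:
        """Record that a design or code item satisfies the requirement at this node."""
        if node not in self.requirements:
            raise KeyError(f"unknown requirement node {node}")
        self.links[node][kind].add(item)

    def untraced(self, kind: str) -> list:
        """Requirement nodes with no design (or code) item matched to them yet."""
        return sorted(n for n in self.requirements if not self.links[n][kind])


def parent_of(node: str) -> Optional[str]:
    """Hypothetical numbering: 'A12' is a child of 'A1'; 'A1' is a top-level node."""
    return node[:-1] if len(node) > 2 else None
```

For example, after `add_requirement("A11", "Spec para. 3.2.1")` and `link("A11", "design", "module NAV")`, a call to `untraced("design")` lists every other node that still lacks a matching design element.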

The  use  of  Structured  Analysis  to  verify  specifications  is  a 
relatively  recent  application  of  the  technique.  Further  experience 
will  provide  greater  knowledge  about  the  types  of  errors  it  can  and  cannot 
find.  In  applying  this  technique  for  verification,  one  must  constantly  bear 
in  mind  that  one  is  not  constructing  alternate  or  replacement  functional 
requirements  documents.  Rather  one  is  attempting  to  restate,  as  accurately 
as  possible,  what  is  in  the  original  functional  requirements  documents. 

This  can  be  difficult  and  the  degree  of  difficulty  is  proportional  to  the 
lack  of  structure  and  completeness  of  the  original  documents.  Inferring 
the  structure  that  may  be  inherent  in  a mass  of  detail  requires  an  under- 
standing of  that  detail. 

In  Structured  Analysis  one  should  focus  first  on  what  functions 
should  be  accomplished  by  the  software.  Only  later,  when  these  functions 
are  defined  should  one  annotate  the  model  to  include  timing  requirements. 
Unfortunately,  many  real-time  systems  have  their  timing  considerations  al- 
most inextricably  mixed  with  actual  functional  requirements.  For  example, 
a requirement  specification  may  state  that  functions  A,  B and  C are  to  be 
performed  every  second,  and  functions  D,  E and  F every  minute.  However, 
the  grouping  of  functions  A,  B and  C may  only  be  a consequence  of  the 
fact  that  they  need  to  be  performed  at  a higher  rate  than  functions  D,  E and 
F.  In  the  model  the  functions  to  be  performed  must  be  separated  from 
such  timing  requirements  so  that  the  model's  structure  is  dictated  by 
functional  considerations  rather  than  by  the  less  binding  timing  considera- 
tions. Similarly,  the  model  must  be  constructed  with  the  functional  re- 
quirements separated  from  the  implementation  dependent  considerations. 
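The separation argued for above can be illustrated with the A through F example. In this sketch the functional grouping is invented purely for illustration (the text does not say what the six functions do), and the names `FUNCTION_TREE`, `PERIOD_SECONDS`, and `rate_groups` are hypothetical. The point is that the model's structure comes from the functional tree, while the rate groups of the original specification can still be recovered as a derived view of the timing annotations.

```python
# Functional decomposition: grouped by what each function does.
# The three group names are hypothetical placeholders.
FUNCTION_TREE = {
    "tracking": ["A", "D"],
    "display": ["B", "E"],
    "recording": ["C", "F"],
}

# Timing requirements, annotated afterwards on each function (period in seconds).
PERIOD_SECONDS = {"A": 1, "B": 1, "C": 1, "D": 60, "E": 60, "F": 60}


def rate_groups(periods: dict) -> dict:
    """Derive the specification's rate groups from the per-function annotations."""
    groups = {}
    for function, period in sorted(periods.items()):
        groups.setdefault(period, []).append(function)
    return groups


def unannotated(tree: dict, periods: dict) -> list:
    """Functions in the model that have no timing requirement attached yet."""
    return sorted(f for members in tree.values() for f in members if f not in periods)


print(rate_groups(PERIOD_SECONDS))
print(unannotated(FUNCTION_TREE, PERIOD_SECONDS))
```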

Even  after  a model  is  constructed,  there  may  be  some  difficulty 
in  communicating  to  the  original  requirement  specification's  developer  the 
deficiencies  in  the  specification.  As  was  indicated  earlier,  the  model  is 
not  a replacement  for  the  original  specification,  and  as  a consequence  all 
criticisms  of  that  specification  should  be  expressed  in  a way  that  is  in- 
dependent of  the  model.  It  may  be  argued  that,  if  the  model  is,  or  can  be 
made  to  be  a superior  definition  of  the  functional  requirements,  the  model 
should  be  used  by  the  developer  as  a basis  for  his  subsequent  development 
activities.  This,  however,  is  undesirable,  as  it  would  destroy  the  separa- 
tion of  roles  between  the  developer  and  validator. 

The  work  that  has  been  accomplished  does  indicate  that  Structured 
Analysis  will  be  beneficial  in  the  review  of  specifications  as  well  as  in 
the  writing  of  specifications.  Structured  Analysis  does  provide  an  organized 
method  and  discipline  for  the  modeling  of  specifications  that  is  compatible 
with  the  concepts  of  software  engineering.  The  use  of  Structured  Analysis 
can  lead  to  the  early  detection  of  deficiencies  in  software  specifications. 
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The  Production  of  Quality  Software  Is  A Problem 


At the time Jules Verne was writing his novel about a journey to
the moon, no one took him seriously. Yet in the early 1960's, when
President J. F. Kennedy set his goal of reaching the moon within a
decade, the task appeared much more reasonable. The major difference
between the 19th and 20th century attitudes can be attributed to
the  new  tools  like  the  computer  which  have  greatly  extended  man's  scope 
of  feasible  problems. 

As  with  many  of  man's  tools  one  must  utilize  the  computer  with  some 
degree  of  caution. 

Programming  has  been  described  by  Dijkstra  as  the  most  complex  mental 
activity  ever  undertaken  by  mankind.[1]  Advances  in  computer  hardware 
capabilities  have  brought  previously  unsolvable  problems  into  the 
scope  of  solvability.  Man's  rapidly  increasing  reliance  upon  the  correct 
behavior  of  complex  software  systems  is  of  definite  concern.  As  new 
systems realizing more complex hardware and software functions continue
to evolve, accurate and efficient implementation of these systems becomes a
critical consideration.
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BACKGROUND  AND  SIGNIFICANCE 

The concept of "structured programming" associated with Dahl-Dijkstra-Hoare[2],
Mills[3], et al., was originally suggested as a vehicle for improving
software quality through a disciplined development of algorithms. The
use of a restricted set of control structures within programs was one
part of that concept. Today, common use of the term "structured programming"
is associated with so many diverse activities that Dijkstra, the term's
originator, now attempts with great vigor to avoid its use.[4]

Wirth  presents  another  very  interesting  treatment  of  the  processes  for 
methodically  and  systematically  designing  algorithms.  Wirth  goes  beyond 
the  notions  of  restricted  flow  of  control  and  addresses  issues  of  data 
representation  and  operations  allowed  on  data  in  his  development  of 
PASCAL.[5] 

In  an  attempt  to  bring  the  benefits  of  "structured  programming"  to  the 
common  people  many  zealous  prophets  have  arisen  each  with  their  own 
new  set  of  restricted  control  structures  and  their  associated  bag  of 
tools,  usually  a preprocessor,  for  allowing  the  use  of  these  control 
structures  with  existing  languages  (e.g.  FORTRAN  and  COBOL).[6] 

Another set of tools has also been introduced over the last few years
which deals with control flow through programs.[7-13] Software probes
or instrumentation are automatically placed into a program for monitoring
the dynamic execution behavior of an algorithm. Software probes in
the form of source language statements are inserted into the source code
to gather statistics during program execution. These probes can provide
insight into many aspects of algorithmic behavior beyond a simple flow
of control analysis. The notion of building self-metric (self-measuring)
software has been introduced previously by this author[14]; however,
significant expansion of this concept is now being explored as a vehicle
for improving software quality.[15-21]
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In  order  to  illustrate  the  type  of  automated  tool  capabilities  currently 
available  and  some  of  the  new  techniques  now  under  consideration,  the 
tool  most  familiar  to  the  author  will  be  described.  It  is  hoped  that 
this  currently  operational  system  will  offer  some  insight  into  the 
concept  of  self-metric  software  and  show  a few  of  the  measurement  schemes 
available  for  dynamic  program  analysis. 

AUTOMATED  TOOLS 

The Program Evaluator and Tester System - A Program for Generating
Self-Metric Software

Ongoing  research  is  being  conducted  by  McDonnell  Douglas  Astronautics 
Company  in  the  area  of  software  validation.  Several  powerful  tools 
have  been  and  are  currently  being  developed  for  this  purpose.  A FORTRAN 
implementation  of  the  Program  Evaluator  and  Tester  (PET)  demonstrates 
the  value  of  a self-metric  testing  approach  for  higher-level  languages. 
Similar tools for other source languages are also being developed [13, 14, 20].

These  tools  essentially  "instrument"  the  software  under  test  by  inserting 
the  software  equivalent  of  sensors  into  the  subject  programs.  This  has 
the  effect  of  making  the  programs  self-measuring.  Each  time  the  software 
under  test  performs  a significant  event,  the  occurrence  of  this  event 
is  recorded,  with  the  exact  nature  of  the  recording  process  depending  on 
the  measurements  desired  and  the  type  of  event  performed. 

Source Program Instrumentation

Figure  1 shows  a sample  FORTRAN  program  segment  and  its  related  flow  diagram. 
Optional  instrumentation  points  are  identified  for  three  sets  of  sensors 
applicable  to  this  particular  program  segment.  The  execution-count 
sensors  are  shown  at  all  points  on  the  flow  diagram  where  the  internodal 
traversal  frequency  is  not  obvious. 
These execution-count sensors can be most efficiently realized by associating
a unique counting cell with each sensor and inserting these sensors only
at those points where logical program breaks occur. The min/max/first/last
sensors for assignment statements appear following non-trivial assignment
statements. A trivial assignment statement, one in which a constant
or literal is assigned to a variable, causes no instrumentation to be
generated. The min/max/first/last sensors for DO-loop control variables
are inserted after the label or statement number, if present, and before
the invocation of the loop if variable parameters are used in any of the
three loop-controlling fields.

Two techniques have been experimentally used in implementing these and
several other types of software sensors: (1) direct code insertion, and
(2) invocation of run-time routines. The direct code insertion appears
to  be  faster  in  most  cases  but  the  run-time  routine  is  more  flexible 
in  that  measurements  can  be  more  easily  altered  at  execution  time.  The 
current  PET  system  utilizes  a hybrid  scheme  employing  the  selective 
use  of  both  types  of  instrumentation. 
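
The two sensor types described above can be sketched in a modern language. The fragment below is only an illustration in Python, not PET's FORTRAN implementation; the names `counts`, `RangeSensor`, `ranges`, and `classify` are hypothetical, and the sensors are placed by hand where a preprocessor would insert them.

```python
from collections import defaultdict

# One counting cell per execution-count sensor (sensor id -> hits).
counts = defaultdict(int)


class RangeSensor:
    """Records min/max/first/last of the values assigned at one statement."""

    def __init__(self):
        self.first = self.last = self.min = self.max = None

    def record(self, value):
        if self.first is None:
            self.first = self.min = self.max = value
        self.last = value
        self.min = min(self.min, value)
        self.max = max(self.max, value)
        return value  # pass the value through so the assignment is unchanged


ranges = defaultdict(RangeSensor)


def classify(values):
    """Hypothetical routine under test, instrumented by direct code insertion."""
    total = 0  # trivial assignment (a constant): no sensor generated
    for v in values:
        counts["loop body"] += 1
        if v < 0:
            counts["branch: negative"] += 1
            continue
        counts["branch: non-negative"] += 1
        total = ranges["total"].record(total + v)  # non-trivial assignment
    return total


result = classify([4, -1, 7, 2])
```

Calling a `record` method corresponds to the run-time-routine technique, while the in-line `counts[...] += 1` statements correspond to direct code insertion, so the sketch mixes both in the spirit of the hybrid scheme.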

As  a result  of  running  the  instrumented  program,  a profile  is  produced 
containing  part  or  all  of  the  following  measurements: 

1.  The  number  and  percentage  of  all  potential  executable  source 
statements  which  were  executed  one  or  more  times. 

2.  The  number  and  percentage  of  those  program  branches  taken. 

3.  The  number  and  percentage  of  those  subroutine  calls  which 
were  executed. 

4.  The number of times each subroutine was called, together with
a list of those subroutines that were never entered.

5.  Relative timing on the subroutine level.

6.  Specific data associated with each executable source statement.

a.  Detailed execution counts

b.  Detailed branch counts on all IF and GOTO statements

c.  Optional data range values (min/max/first/last) on assignment
statements.

d.  Optional min/max ranges on DO-loop control variables.

These  summaries  and  detailed  reports  can  be  employed  to  establish  a figure 
for  the  degree  of  testing  to  which  the  program  structure  has  been  subjected. 
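
As a rough illustration of how such a figure could be derived from the counting cells, the following Python sketch computes the number and percentage of statements and branches exercised. The function name `coverage_summary` and the sample counts are assumptions made for the example; they are not taken from PET.

```python
def coverage_summary(statement_counts, branch_counts):
    """Return (executed, total, percent) for statements and for branches."""

    def summarize(cells):
        total = len(cells)
        executed = sum(1 for hits in cells.values() if hits > 0)
        percent = 100.0 * executed / total if total else 0.0
        return executed, total, percent

    return {
        "statements": summarize(statement_counts),
        "branches": summarize(branch_counts),
    }


# Hypothetical counting cells collected from one test run.
statement_counts = {10: 1, 20: 5, 30: 5, 40: 0, 50: 1}
branch_counts = {"20:true": 4, "20:false": 1, "40:true": 0, "40:false": 0}

profile = coverage_summary(statement_counts, branch_counts)
```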

The  measurements  performed  allow  us  to  examine  the  internal  structure  of 
a software  system  rather  than  merely  treating  it  as  if  it  were  a "black  box". 
While  the  testing  process  does  not  technically  prove  the  correctness  of  the 
algorithms  used,  it  does  allow  us  to  observe  the  behavior  of  the  algorithms 
with  actual  test  data.  From  these  observations,  the  effectiveness  of 
the  software  validation  process  can  be  greatly  improved. 

Appendix  A contains  a number  of  sample  reports  produced  by  the  PET  system 
along  with  accompanying  descriptions  explaining  their  use. 

Automated Program Support System Concept

The  concept  of  an  Automated  Program  Support  System  offering  assistance  in 
several  new  areas  of  the  software  engineering  process  is  currently  being 
formulated  to  incorporate  present  tools  and  techniques  within  a comprehensive 
collection  of  automated  aids  for  software  development  and  certification. 

This  extensible  system  contains  five  major  components: 

1.  A Program  Development  Subsystem  for  guiding  the  development 
of  well  structured  programs  in  a top-down  fashion; 

2.  A Test Statistics Subsystem for both gathering static
measurements and monitoring the dynamic behavior of software
(an extension of MDC's currently existing PET system;
see Appendix A);

3.  A Documentation Subsystem for assisting in the generation
and maintenance of detailed software documentation;

4.  A Simulation Subsystem for monitoring the behavior of
assembly language software and scaling real-time data
used for validation testing; and

5.  A Test Data Generation Subsystem for synthesizing and
assessing tests of critical software components [22].

Each  component  tool  of  this  system  may  be  applied  to  a given  program 
separately,  although  the  tools  can  be  most  effectively  used  in  a 
disciplined  rational  sequence. 

Current Areas of Research

The  balance  of  this  paper  will  be  restricted  to  remarks  describing  new 
research  currently  underway  on  the  Test  Statistics  Subsystem  referenced 
above. 

There is currently a large gap between exhaustive testing of all potential
program paths and the statement that all program branches or segments
have been exercised at least once. Advanced research efforts are being
conducted in several areas in an attempt to help bridge this gap.

One  very  promising  area  involves  the  incorporation  of  an  assertion 
capability  into  existing  languages. 

Assertion and Monitor Capabilities for Existing Languages

A number of new languages are currently being investigated by researchers
for possible use in proving the correctness or consistency of programs.
Although the various proof techniques differ to some extent, one common
requirement is the development of assertions describing the nature of
the algorithms. The program provers are currently exploring techniques
for checking the consistency of these assertions in a static logical
sense for specialized languages.




The  assertion  concepts  for  programming  languages  which  are  now  being 
developed  constitute  major  extensions  to  our  ability  to  carry  out 
"systematic  programming".  These  new  assertion  concepts  impact  all 
phases  of  the  software  life  cycle  from  initial  requirements  and  design 
phases  down  through  certification,  and  maintenance  iterations.  These 
assertion  concepts  are  designed  to  encourage  the  development  of 
algorithmic  validation  criteria  as  the  implementation  evolves  from  the 
initial  algorithm  requirements  and  specifications  down  to  the  final 
program  code.  The  idea  can  be  illustrated  by  Figures  2 and  3 which 
show  how  the  point  at  which  validation  and/or  other  quality  criteria 
can  be  attached  directly  to  the  software  definitely  impacts  the  quality 
of  the  end  software  and  the  effort  needed  to  validate  and  maintain  that 
software. 


Point at Which Specific Quality Criteria are Defined

ASSERTION CONCEPTS FOR IMPROVING SOFTWARE QUALITY


Generalized Local Assertion

A generalized  local  assertion  may  be  embedded  in  a comment  at  any  point 
within  the  executable  code  of  a program  where  another  executable  statement 
may  appear.  The  local  assertion  is  designed  to  enhance  the  documentation 
of  critical  algorithms  throughout  the  entire  life  cycle  of  the  software. 
Dynamic  execution  time  checks  can  be  activated  at  selected  points  in  time 
to  ensure  that  the  actual  run-time  environment  is  consistent  with  the 
logical  state  specified  in  the  assertion.  This  dynamic  assertion  checking 
can  be  used  to  great  benefit  in  debugging,  validation,  certification,  and 
maintenance  of  complex  systems. 

The  format  of  the  generalized  local  assertion  is: 

ASSERT LOCAL (extended-logical-expression) [optional-qualifiers] [control-options]

The  exact  placement  and  treatment  of  the  assertions  will  be  tailored  to 
the  existing  language  facilities  in  currently  defined  languages.  In  these 
currently  available  languages  the  assertions  will  be  implemented  through 
specialized  comments  processed  by  a source  code  preprocessor.  New  language 
development  and  future  compilers  for  existing  languages  may  contain  options 
for  directly  implementing  the  assertions. 

Optional-Qualifiers 

In order to provide existential and universal quantification within the
generalized local assertion, an optional looping capability is defined:

...FOR  " [(variable-list)  (set  of  ranges/values)] 

WHERE (qualifier-controlling-logical-expr)

e.g.  /* ASSERT (X(I) ¬= X(J)) FOR ALL (I,J) (1:8) WHERE (I ¬= J) */
means: [∀ I,J such that 1 ≤ I ≤ 8, 1 ≤ J ≤ 8 ∧ I ≠ J] → X(I) ≠ X(J)
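
The looping qualifier can be read as a nested iteration over the listed variables. The Python sketch below is one hedged interpretation of the FOR ALL ... WHERE form; the helper name `for_all` and its argument order are hypothetical, and the array `X` is padded at index 0 so that subscripts match the 1-based notation of the example.

```python
from itertools import product


def for_all(predicate, n_vars, low, high, where=None):
    """Return every index tuple in low..high (inclusive) that passes `where`
    but fails `predicate`; an empty list means the assertion holds."""
    failures = []
    for indices in product(range(low, high + 1), repeat=n_vars):
        if where is not None and not where(*indices):
            continue
        if not predicate(*indices):
            failures.append(indices)
    return failures


# ASSERT (X(I) ¬= X(J)) FOR ALL (I,J) (1:8) WHERE (I ¬= J)
X = [None, 3, 7, 1, 8, 2, 5, 4, 6]          # index 0 unused
holds = for_all(lambda i, j: X[i] != X[j], 2, 1, 8, where=lambda i, j: i != j)

X_bad = [None, 3, 7, 1, 8, 2, 5, 3, 6]      # X(7) duplicates X(1)
fails = for_all(lambda i, j: X_bad[i] != X_bad[j], 2, 1, 8,
                where=lambda i, j: i != j)
```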

Assertion Control Options

The  total  control  alluded  to  above  (i.e.,  ignoring  all  assertions  by  treating 
them  as  comments)  offers  the  user  a binary  choice  as  to  whether  or  not  to 
apply  dynamic  assertions  during  program  development;  however,  other  levels  of 
control  are  provided  within  the  assertion  language  itself. 

The assertion language itself contains three hierarchical levels of control:

1)  instrumentation  control  - control  of  those  sets  of  assertions 
which  will  be  instrumented  at  a given  level  of  testing, 

2)  dynamic  control  - run-time  control  of  those  instrumented 
assertions  which  are  to  be  dynamically  checked,  and 

3)  threshold  control  - user  control  when  assertion  violations 
are  observed. 


Instrumentation  control  is  provided  by  a LEVEL  option.  The  syntax  of  this 
option  is: 

...LEVEL (preprocessor-control-expression)...

The  LEVEL  option  provides  information  to  the  preprocessor  telling  it  which 
sets  of  assertions  should  be  considered  for  dynamic  analysis.  This  level 
of  control  provides  a means  for  testing  selected  software  features  at 
various  points  within  the  software  development  cycle  and  fits  in  well  with 
the  top  down  approach  to  program  development.  This  also  allows  a user  to 
group  sets  of  assertions  together  for  various  types  of  dynamic  checks. 

Dynamic  control  is  provided  by  a CONDITION  option.  The  syntax  of  this 
option  is: 

...CONDITION (dynamic-control-expression)...

The CONDITION option provides run-time control of the assertions which have
been built into the program. This option affects only those assertions
which have actually been instrumented; thus the CONDITION option is of lower
priority than the LEVEL option. It should also be noted that the CONDITION
option can be dynamically changed under program control to activate or
deactivate the assertion as often as desired.
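
The relationship between the first two levels of control can be sketched as follows. This Python fragment is an assumption-laden illustration: the paper does not define the form of the preprocessor-control-expression, so LEVEL is modeled here simply as membership in a set of selected levels, and CONDITION as a flag in the program state. The names `ACTIVE_LEVELS`, `instrument`, and `checking_on` are hypothetical.

```python
ACTIVE_LEVELS = {2, 4}   # chosen when the (hypothetical) preprocessor runs


def instrument(level, check):
    """LEVEL: return a live checker only if `level` was selected;
    otherwise the assertion stays an inert comment."""
    if level not in ACTIVE_LEVELS:
        return lambda state: None            # never instrumented

    def checker(state):
        # CONDITION: run-time switch, consulted on every execution.
        if not state.get("checking_on", True):
            return None
        return check(state)

    return checker


check_move = instrument(2, lambda s: s["MOVE"] < 9)
check_debug = instrument(5, lambda s: s["MOVE"] > 0)

state = {"MOVE": 9, "checking_on": True}
r1 = check_move(state)          # instrumented and active: assertion evaluated
state["checking_on"] = False
r2 = check_move(state)          # instrumented but deactivated at run time
r3 = check_debug(state)         # level 5 not selected: no instrumentation
```

Because the level test happens once, before execution, an unselected assertion costs nothing at run time, whereas the CONDITION test is repeated at each execution of an instrumented assertion.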


Threshold control is provided by a LIMIT option. The syntax of this option is:

...LIMIT n [VIOLATIONS] {HALT | EXIT [VIA] proc-name}

The LIMIT option provides user control in the event of n violations of
the corresponding assertion. The user can specify that control be
transferred to a wrap-up procedure proc-name if the EXIT phrase is
specified. Otherwise, the HALT phrase will simply terminate execution
and generate an assertion report automatically if n assertion violations
are encountered.

Motivated by a need to make assertions about arrays as well as scalars,
the following notation has been adopted.

Array Notation Assertions

Two  areas  of  concern  immediately  arise  when  discussing  data  arrays,  namely, 
array  indices  and  array  values.  Thus,  if  one  is  monitoring  program  behavior, 
it  is  not  enough  to  monitor  array  values  alone,  since  program  logic  is 
invariably  concerned  with  where  these  values  are  stored  within  the  array. 

The  approach  is  to  generalize  the  assertion  and  monitor  capabilities 
to  include  data  arrays.  Array  notation  is  as  follows: 

Assume an array of the form A(I1, I2, I3, ..., In). References to
specific subsets of array values or array indices are indicated
by A(I1', I2', I3', ..., In'), where Ij' is a subrange of Ij. This
notation is position dependent; i.e., if Ij is not restricted,
its position must be indicated by an asterisk (*), as in
A(I1', *, I3', ..., In'). The format of each Ij' is l:u where l ≤ u ≤ Ij.
If l = u then the pair l:u may be replaced by the simple token u,
as in A(..., u, ...). Thus, for A(10,20) we might reference

A(5,10:15) 

A(*  ,3) 

A(2:5,*) 

A(2:6,2:10)

Extended Logical Expressions

Two types of extended logical operations have initially been defined for
the assertion language. An array to scalar logical operation will be
allowed with its result being defined as 'true' if and only if all
component to scalar operations are found to be true. An array to array
logical operation will be allowed for identically specified array
cross-sections with its result being defined to be 'true' if and only if
each pairwise component operation yields a result of 'true'.
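
A minimal Python sketch of the cross-section notation and the two extended logical operations follows. It assumes a two-dimensional array stored as a list of rows and addressed with 1-based subscripts; the helper names `cross_section`, `array_to_scalar`, and `array_to_array`, and the tuple encoding of a specification such as A(2:6,2:10), are choices made for this example only.

```python
import operator


def cross_section(array, spec):
    """Select elements of a 2-D array by a spec such as ("*", 3),
    (5, (10, 15)) or ((2, 6), (2, 10)); subscripts are 1-based."""
    dims = (len(array), len(array[0]))

    def expand(s, size):
        if s == "*":                       # whole dimension
            return range(1, size + 1)
        if isinstance(s, tuple):           # l:u subrange
            return range(s[0], s[1] + 1)
        return range(s, s + 1)             # single subscript

    rows, cols = (expand(s, n) for s, n in zip(spec, dims))
    return [array[i - 1][j - 1] for i in rows for j in cols]


def array_to_scalar(section, op, scalar):
    """True if and only if every component-to-scalar comparison is true."""
    return all(op(x, scalar) for x in section)


def array_to_array(left, op, right):
    """True if and only if every pairwise comparison is true."""
    if len(left) != len(right):
        raise ValueError("cross-sections must be identically specified")
    return all(op(a, b) for a, b in zip(left, right))


# A(10,20) with A(i,j) = i + j, purely as sample data.
A = [[i + j for j in range(1, 21)] for i in range(1, 11)]
col3 = cross_section(A, ("*", 3))            # A(*,3)
row5 = cross_section(A, (5, (10, 15)))       # A(5,10:15)
```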

Local Assertion Examples

A simple  local  assertion  example  is  shown  below  in  a typical  report  format. 

The  assertion  simply  indicates  that  at  the  point  where  it  is  inserted 
into  the  source  code  we  expect  the  value  of  the  variable  MOVE  to  be 
less  than  9.  The  report  format  indicates  that  this  assertion  was 
checked  9 times.  Violations  were  noted  on  the  6th  and  7th  executions 
of  the  assertion.  It  is  furthermore  noted  that  MOVE  actually  contained 
the  value  9 on  those  two  instances.  A snapshot  is  taken  of  all  pertinent 
variable  values  associated  with  the  violation  when  the  trace  mode  is 
specified. 

                                          EXECUTION
                                          COUNT      SPECIFIC EXECUTION DATA

C ASSERT LOCAL (MOVE .LT. 9) LIMIT 10         9      ASSERTION VIOLATIONS 2
                                                     EXEC NUMBER   VARIABLE   VALUE
                                                          6        MOVE         9
                                                          7        MOVE         9

It  is  also  worth  noting  that  had  we  encountered  10  violations  we  would  have 
halted  execution  at  this  point  in  the  program. 
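
The bookkeeping behind such a report, including the LIMIT threshold, can be sketched in Python as follows. This is an illustrative model rather than the PET mechanism: the class names `LocalAssertion` and `AssertionLimitReached`, the `exit_via` callback standing in for EXIT VIA proc-name, the report layout, and the sample sequence of MOVE values are all assumptions.

```python
class AssertionLimitReached(Exception):
    """Raised when an assertion has been violated `limit` times (HALT)."""


class LocalAssertion:
    def __init__(self, text, predicate, limit=None, exit_via=None):
        self.text = text
        self.predicate = predicate
        self.limit = limit            # LIMIT n
        self.exit_via = exit_via      # EXIT VIA proc-name (a callable here)
        self.executions = 0
        self.violations = []          # (execution number, variable snapshot)

    def check(self, **variables):
        self.executions += 1
        if self.predicate(**variables):
            return
        self.violations.append((self.executions, dict(variables)))
        if self.limit is not None and len(self.violations) >= self.limit:
            if self.exit_via is not None:
                self.exit_via(self)
            else:
                raise AssertionLimitReached(self.report())

    def report(self):
        lines = [
            f"{self.executions:>5}  ASSERTION VIOLATIONS {len(self.violations)}",
            f"       {self.text}",
            "       EXEC NUMBER  VARIABLE  VALUE",
        ]
        for number, snapshot in self.violations:
            for name, value in snapshot.items():
                lines.append(f"       {number:>11}  {name:<8}  {value}")
        return "\n".join(lines)


move_check = LocalAssertion("ASSERT LOCAL (MOVE .LT. 9) LIMIT 10",
                            lambda MOVE: MOVE < 9, limit=10)
for move in [1, 3, 5, 7, 8, 9, 9, 4, 2]:     # hypothetical MOVE values
    move_check.check(MOVE=move)
print(move_check.report())
```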

Examples of the use of array cross-sections in extended logical expressions
include the following (assume an array A(10,20) has been defined):

(a)  ASSERT LOCAL (A(*,3) .LT. 10) LIMIT 6 VIOLATIONS

(b)  ASSERT LOCAL (A(2:6,2:10) .NE. 0)

(c)  ASSERT LOCAL (A(*,*) .GT. 0)


In (a), the value of each array element whose second subscript is 3 is
checked and reported as a violation if its value is not less than 10. Ten
array values will be checked in all. Any number of assertion violations
within an array operation cause the operation to be counted as a single
assertion violation. Thus, the LIMIT 6 parameter is concerned only with
the number of invalid operations, not with the number of violations
within the array.

In  (b),  only  array  values  within  the  specified  subscript  ranges  are  checked 
for  an  assertion  violation.  In  (c)  all  array  values  are  checked  for  an 
assertion  violation. 

Specialized Local Assertions

A number  of  additional  specialized  local  assertions  are  proposed  to 
facilitate  the  expression  of  user  validation  criteria.  This  extensible 
attribute  of  the  local  assertion  concept  is  illustrated  by  the  following 
constructs: 

ASSERT LOCAL VALUE[S] (variable-list) (set-of-legal-ranges-and/or-values) ...

ASSERT LOCAL VALUE[S] (variable-list) NOT (set-of-illegal-ranges-and/or-values) ...

ASSERT LOCAL VALUE[S] (variable-list) INVARIANT ...

ASSERT LOCAL SUBSCRIPT RANGE (list-of-array-specifications) ...

ASSERT LOCAL ORDER (array-cross-section) {ASCENDING | DESCENDING} ...

All of these specialized local assertions could be replaced by one or more
generalized local assertions; however, their existence facilitates the
graceful transition from program requirements and their associated validation
criteria to embedded program documentation in the evolving code.

These constructs cause instrumentation to be generated at the position where
they occur or at the next executable statement. Forms three and four will
monitor the execution of the following statements to ensure that they do
not alter the value of an invariant variable (e.g., through side effects
from subroutine or function calls) or use subscripts outside the specified
ranges.

The ASSERT ORDER statement checks a sequence of array values as follows:

ASSERT LOCAL ORDER (A(*,3)) ASCENDING

For an array A(10,20), the following assertion violation summary illustrates
the type of information traced for a violation:

EXECUTION
COUNT      SPECIFIC EXECUTION DATA

  229      ASSERTION VIOLATIONS 1
           EXEC NUMBER    SEQUENCE SNAPSHOT    VALUE
               18         A(7,3)                 6
                          A(8,3)               100*
                          A(9,3)                 8
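
The ordering check can be sketched in a few lines of Python. The function `check_order` is hypothetical, and the sketch assumes that equal neighboring values are acceptable in an ascending sequence, a detail the paper does not state. The sample column reproduces the values 6, 100, 8 shown in the snapshot above; the remaining values are invented.

```python
def check_order(values, ascending=True):
    """Return the 0-based positions k where values[k] and values[k + 1]
    are out of order (equal neighbors are accepted)."""
    out_of_order = []
    for k in range(len(values) - 1):
        a, b = values[k], values[k + 1]
        if (a > b) if ascending else (a < b):
            out_of_order.append(k)
    return out_of_order


# Hypothetical contents of A(1:10,3); A(7,3)=6, A(8,3)=100, A(9,3)=8.
column3 = [1, 2, 3, 4, 5, 5, 6, 100, 8, 9]
breaks = check_order(column3)

# Snapshot of the neighborhood of the first break, using 1-based subscripts.
k = breaks[0]
snapshot = [(f"A({i + 1},3)", column3[i]) for i in range(k - 1, k + 2)]
```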


The ASSERT VALUES statement checks variable values against a specific set
of ranges and/or values to ensure that each lies within the desired set.
Distinct ranges are specified by pairs "min-expr:max-expr". Each range
pair or distinct value within a set of value specifications is separated
by a comma. For example, the following assertion could be used to
check the values of the variables X and Y.

/* ASSERT LOCAL VALUES (X,Y) (0,3,5:7,11:13,21) */

In  the  example  the  current  values  of  X and  Y would  be  compared  with  the  set: 

(0,3,5,6,7,11,12,13,21) 

Any  discrepancies  (i.e.  any  current  values  not  in  the  set  of  specified 
legal  values)  would  then  constitute  an  assertion  violation.  The 
ASSERT  SUBSCRIPT  RANGE  statement  will  check  addressing  on  those  arrays 
specified  to  ensure  that  only  those  portions  of  the  array  specifically 
selected  are  accessed.  A subsequent  example  will  illustrate  the  usefulness 
of  this  concept  later  in  this  paper.  All  of  these  latter  constructions 
will  result  in  providing  similar  traces  to  those  already  presented  for 
out  of  bound  conditions. 
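
The expansion of a value specification into its set of legal values, as in the X and Y example, can be sketched as follows. The sketch handles integer constants only (the paper allows expressions as range bounds), and the names `parse_value_spec` and `assert_values` are hypothetical.

```python
def parse_value_spec(spec):
    """Expand a spec such as "0,3,5:7,11:13,21" into the set of legal
    integer values; each "min:max" pair is an inclusive range."""
    legal = set()
    for item in spec.split(","):
        if ":" in item:
            low, high = (int(part) for part in item.split(":"))
            legal.update(range(low, high + 1))
        else:
            legal.add(int(item))
    return legal


def assert_values(variables, spec):
    """Return the variables whose current value is not a legal value."""
    legal = parse_value_spec(spec)
    return {name: value for name, value in variables.items()
            if value not in legal}


legal_set = sorted(parse_value_spec("0,3,5:7,11:13,21"))
violations = assert_values({"X": 6, "Y": 9}, "0,3,5:7,11:13,21")
```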

A TRACE statement is available to allow the user to control the number
of execution snapshots reported for local assertion violations.

The format of the TRACE statement is:

TRACE {FIRST | LAST | OFF} n [VIOLATIONS]


If  a TRACE  statement  is  not  coded  in  the  source  program,  detailed  traces 
for  local  assertion  violations  will  not  be  reported.  If  TRACE  statements 
are  coded,  the  first  TRACE  statement  encountered  causes  the  first  (or  last) 
n violations of subsequent local assertions to include snapshot information.
Any  subsequent  TRACE  statements  encountered  reset  the  values  specified  by 
the  previous  TRACE  statement.  TRACE  OFF  halts  the  reporting  of  execution 
snapshots  for  local  assertions  until  the  next  TRACE  statement  is  encountered. 
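
One way to model the FIRST, LAST, and OFF modes is a small recorder that either stops accepting snapshots after n entries or keeps only the most recent n. The Python sketch below is an assumption about how such retention could work; the class name `Trace` and its interface are hypothetical.

```python
from collections import deque


class Trace:
    """Retains the first n or last n violation snapshots, or none (OFF)."""

    def __init__(self, mode="OFF", n=0):
        self.mode, self.n = mode, n
        # A bounded deque silently discards the oldest snapshot when full.
        self.snapshots = deque(maxlen=n) if mode == "LAST" else deque()

    def record(self, snapshot):
        if self.mode == "LAST":
            self.snapshots.append(snapshot)
        elif self.mode == "FIRST" and len(self.snapshots) < self.n:
            self.snapshots.append(snapshot)
        # mode "OFF": nothing is retained


first = Trace("FIRST", 2)
last = Trace("LAST", 2)
off = Trace("OFF")
for violation_number in range(1, 6):          # five hypothetical violations
    for trace in (first, last, off):
        trace.record(violation_number)
```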


The Concept of a Global Assertion

Expanding our notion of assertions, we immediately identify the need to
expand the scope of application for our asserted program properties. In
an  effort  to  avoid  requiring  several  similar  local  assertions  within  a 
particular  program  region,  the  concept  of  a global  assertion  has  been 
introduced.  This  is  a novel  approach  which  promises  to  have  a significant 
impact  on  the  way  we  design,  implement,  and  test  software. 


Global  assertions  will  allow  us  to  extend  our  capacity  to  inspect  certain 
behavioral  patterns  for  entire  program  modules,  selected  regions  of  modules, 
or  module  interfaces  (entries  and/or  exits).  Global  assertions  appear 
in  the  declaration  section  of  the  program  module. 


Formats include:

ASSERT {GLOBAL | REGIONAL(region name) | ENTRY | EXIT} (extended-logical-expression) ...

ASSERT {GLOBAL | REGIONAL | ENTRY | EXIT} ...

ASSERT {GLOBAL | REGIONAL | ENTRY | EXIT} ...

These  global  assertions  will  have  effect  within  the  scope  defined  (i.e., 
globally  at  all  pertinent  points,  regionally  over  the  named  region, 
collectively  for  all  entries  and/or  all  exits.) 

The VALUES statement inspects each specified variable as its value changes
and reports when: (option 1) the new value is not within one of the specified
legal ranges and/or values, (option 2) the new value falls within a specified
illegal range and/or value, or (option 3) the value of a selected variable
is not preserved (i.e., a direct or externally caused change has occurred).
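
The idea of inspecting a variable each time its value changes can be imitated in Python by intercepting attribute assignment. The sketch below covers option 1 (legal values) and option 3 (invariants) only; the class `MonitoredVars` and the layout of its violation records are assumptions made for illustration.

```python
class MonitoredVars:
    """Namespace that checks declared value sets on every assignment."""

    def __init__(self, legal=None, invariant=()):
        # Bypass our own __setattr__ while creating the bookkeeping fields.
        object.__setattr__(self, "_legal", dict(legal or {}))
        object.__setattr__(self, "_invariant", set(invariant))
        object.__setattr__(self, "violations", [])

    def __setattr__(self, name, value):
        already_set = name in self.__dict__
        if name in self._legal and value not in self._legal[name]:
            self.violations.append((name, value, "illegal value"))
        if (name in self._invariant and already_set
                and value != self.__dict__[name]):
            self.violations.append((name, value, "invariant changed"))
        object.__setattr__(self, name, value)


# ASSERT VALUES (K) (0:100) and ASSERT VALUES (X) INVARIANT
v = MonitoredVars(legal={"K": range(0, 101)}, invariant={"X"})
v.X = -10          # first assignment establishes the invariant value
v.K = 100
v.K = v.K + 1      # 101 lies outside 0:100
v.X = -20          # an invariant variable was altered
```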

The  ASSERT  SUBSCRIPT  RANGE  statement  verifies  that  array  subscripts  fall 
within  a specified  range  whenever  the  array  is  referenced  during  program 
execution.  It  should  be  noted  that  this  statement  provides  a means  for 
checking  portions  of  arrays  as  well  as  normal  upper  and  lower  bounds. 

For  this  reason,  it  is  more  powerful  than  the  PL/I  type  ON  SUBSCRIPT  RANGE 
check. 
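
A subscript-range monitor of this kind can be sketched as an array wrapper that records every reference falling outside the asserted cross-section, even when the reference is within the declared bounds. The class `CheckedArray` and its constructor arguments are hypothetical, and the sketch does not attempt to handle references outside the declared bounds.

```python
class CheckedArray:
    """2-D array with 1-based subscripts that records references outside
    an allowed cross-section."""

    def __init__(self, rows, cols, allowed_rows, allowed_cols):
        self.data = [[0] * cols for _ in range(rows)]
        self.allowed_rows = allowed_rows
        self.allowed_cols = allowed_cols
        self.violations = []

    def _monitor(self, i, j):
        if i not in self.allowed_rows or j not in self.allowed_cols:
            self.violations.append((i, j))

    def __getitem__(self, subscripts):
        i, j = subscripts
        self._monitor(i, j)
        return self.data[i - 1][j - 1]

    def __setitem__(self, subscripts, value):
        i, j = subscripts
        self._monitor(i, j)
        self.data[i - 1][j - 1] = value


# ASSERT SUBSCRIPT RANGE (A(*,3)) for an array A(10,20)
A = CheckedArray(10, 20, allowed_rows=range(1, 11), allowed_cols=range(3, 4))
A[5, 3] = 7
total = A[5, 3] + A[5, 4] + A[6, 4]   # two references outside column 3
```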

Instrumentation  will  be  inserted  into  the  source  program  by  the  preprocessor 
to  accumulate  the  following  statistics  relative  to  assertion  violations: 

(1)  The statement that caused the assertion violation.
For that statement an execution count and violation
execution counts identical to those obtained for local
assertions are reported.

(2)  The actual value that caused the violation. This value
is linked to the statistics identified in (1) above.

A GLOBAL  TRACE  statement  will  be  available  to  allow  the  user  to  control 
the  number  of  execution  counts  and  associated  data  values  reported  for 
global  assertion  violations.  The  format  of  this  statement  is: 


GLOBAL TRACE {FIRST | LAST | OFF} n [VIOLATIONS]


Some  FORTRAN  examples  follow: 


 20        DIMENSION A(10,20)
 21  C     GLOBAL TRACE 10 VIOLATIONS
 22  C     ASSERT VALUES (I,J,K,L) (0:100)
 23  C     ASSERT VALUES (II,LL) (-10:10)
 24  C     ASSERT VALUES (KK,NN) (2,4,6,8,10)
 25  C     ASSERT SUBSCRIPT RANGE (A(*,3))
 26  C     ASSERT VALUES (X,Y,Z) INVARIANT
           ...
102        K = K + 1
103        II = A(L,J) + LL
           ...
234        K = A(J,K) + I*100
235        II = II + 2
236        NN = KK*(I-J)
           ...
300        CALL ROUTINEX(X,Y)

If assertion violations occurred in this example, the following statistics are
indicative of what would be reported by the Postprocessor:

Annotated Program Listing      Execution
                               Count      Specific Execution Data

102  K = K + 1                   511      ASSERTION VIOLATIONS 1
                                          ASSERT VALUE (K) (0:100)
                                          EXEC NUMBER    VALUE
                                              10          101

103  II = A(L,J) + LL            511      ASSERTION VIOLATIONS 3
                                          ASSERT VALUE (II) (-10:10)
                                          EXEC NUMBER    VALUE
                                              22           20
                                          ASSERT SUBSCRIPT RANGE (A(*,3))
                                          EXEC NUMBER    VALUE
                                               5          A(12,3)
                                             105          A(1,4)

234  K = A(J,K) + I*100          125      ASSERTION VIOLATIONS 4
                                          ASSERT VALUE (K) (0:100)
                                          EXEC NUMBER    VALUE
                                              52          101
                                              53          102
                                          ASSERT SUBSCRIPT RANGE (A(*,3))
                                          EXEC NUMBER    VALUE
                                              52          A(5,4)
                                              53          A(6,4)

235  II = II + 2                 125      ASSERTION VIOLATIONS 1
                                          ASSERT VALUE (II) (-10:10)
                                          EXEC NUMBER    VALUE
                                              50           12

236  NN = KK*(I-J)                38      ASSERTION VIOLATIONS 1
                                          ASSERT VALUE (NN) (2,4,6,8,10)
                                          EXEC NUMBER    VALUE
                                              20            7

300  CALL ROUTINEX(X,Y)           53      ASSERTION VIOLATIONS 1
                                          ASSERT VALUE (X) INVARIANT
                                                         VALUE OF CALL PARM X
                                          EXEC NUMBER    BEFORE CALL    AFTER CALL
                                              30            -10            -20

SUGGESTED USE OF ASSERTIONS


Basic Philosophy

Assertions  should  be  developed  and  incorporated  into  a program  during 
program  design  and  not  after-the-fact. 

A Suggested Approach

1)  Prior to generating any detailed code, write down a capsulized
program specification containing as a minimum the following:

a)  a statement of the problem

b)  a short description of the solution techniques involved

c)  summarized input requirements

d)  summarized output requirements

*e)  testing/validation requirements

*f)  performance requirements

2)  Following  the  design  and  declaration  of  each  variable,  array, 
and  structure  include  a comment  defining  its  global  asserted 
attributes. 

Example

DCL I FIXED BIN (15,0);
/* ASSERT VALUE (I) (0:8) */

DCL A(1:100) FIXED BIN (15,0);
/* ASSERT VALUE (A(1:8)) (0,1) */
/* ASSERT SUBSCRIPTRANGE (A(1:8)) */

3)  It  is  suggested  that  assertions  be  used  to  test  entry  and  exit 
conditions  for  all  internal  procedures. 

4)  It is further recommended that they be applied at entry/exit
to internal algorithms within individual procedures (e.g., before
and after important loop structures or procedure calls).

5)  Operations or key algorithmic logic subject to possible singularities
should be preceded by appropriate assertions (e.g., A=B/Z, where Z
has been calculated previously, should be preceded by an assertion
checking for Z not equal to 0).

6)  Use as desired for debugging and other testing.

A Sample Program

Specification 

Problem: 

Create  a program  to  solve  the  8-Queens  problem. 

Description: 

Given  an  8x8  chessboard  and  8 hostile  queens.  Find  a 
position  for  each  queen  (or  a total  board  configuration) 
where  no  queen  may  capture  any  other  queen. 

Inputs: 

*None 

Output : 

. 8x8  chessboard  configuration  describing  the  placement 
of  the  8 hostile  queens 
Test/Validation  Criteria 

. data  integrity  (ASSERT  8x8  board,  ASSERT  H QUEENS  8) 

. every  row,  column  contains  1 queen 
. diagonals,  contain  at  most  one  queen 
Performance  Criteria 

. solution  should  be  found  in  20  cpu  seconds  on  an  IBM  360/91. 

Program  Development 

Stepwise  generation  of  partial  solutions  using  a top-down  development 
scheme  might  produce  the  following  version  of  a program: 

QUEENS: PROC;
   DECLARE
      board, pointer, safe;
   data_integrity_assertions;               /* LEVEL (2) */
   consider_first_column;
   LOOP: DO UNTIL (last_column_done | regress_out_of_first_col);
      try_column;
      IF safe
      THEN
         DO;
            setqueen;
            consider_next_column;
         END;
      ELSE
         regress;
   END LOOP;
   algorithmic_validation_assertions;       /* LEVEL (4) */
END QUEENS;

Although  the  final  solution  differs  in  many  respects  it  basically 
follows  N.  Wirth's  development  of  this  program  In  his  article 
"Program  Development  by  Stepwise  Refinement",  CACM,  Vol.  1*4,  No.  h 
(1971). 
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QUEENS:  PROC; 
DECLARE 
/•  ASSERT 
DECLARE 


I FIXED  BIN  (15,0); 

VALUE  (I)  (0:8)  LEVEL  (2)  V 
J FIXED  BIN  (15,0); 

/*  ASSERT  VALUE  (J)  (0:9)  LEVEL  (2)  •/ 

DECLARE  X(1:8)  FIXED  B»N  (15,0); 

/*  ASSERT  VALUE  (X(1:8»  ) (0:8)  L(2)  */ 

DECLARE  ( A(  1:8),  B(2:16),  C(-7:7)  ) FIXED  BIN  (15,0) 

INIT(  (8)  0,(15)  0,(15)  0); 

/*  ASSERT  VALUES  (AC),  B(*),C(*)  )(0,1)L(2)  «/ 

DECLARE  SAFE  FIXED  BIN(15,0) 

/*  ASSERT  VALUES(SAFE)  (0,3)L(2)  */ 

J=i; 

1=0; 

DO  UNTIL  (J  > 8 I J<  1); 

DO  UNTIL  (SAFE  =0  | 1=8); 

1=1+1; 

SAFE=A(I)+B(I+J)+C(I— J); 

END; 

IF  SAFE=0 
THEN 
BEGIN 

/•  ASSERT  LOCAL  VALUE  (l,J)(1:8)  LEVEL(5)  */ 

/•  ASSERT  LOCAL  VALUES(A(I),B(K+J),C(I-J)  ) (0)  L(5)  */ 
A(l)  = A(l)+1;  /* 

B(l+J)  = B(l+J)+1;  /*  SET  QUEEN 
C(l-J)  =C(I-J)+1;  /• 


*/ 

V 

*/ 


END; 


X(J)=I; 

J=J+1; 

1=0; 

END; 

ELSE 

CALL  REGRESS  (l,J,A,B,C,X); 


/• 

/• 


/* 

/• 


V 

V 

V 


ASSERT  LOCAL  VALUES  X(*)  ) (1:8)  LEVEL(4)  •/ 

ASSERT  (X(ll)  H = X(JJ))  FOR  ALL  (II,  JJ)  (1:8)  WHERE 
(II  1 = JJ)  L(4) 

ASSERT  LOCAL  VALUES  AC)  (1)  L(4) 

ASSERT  LOCAL  VALUES  (BC).CC)  ) (0.1)  LEVEL  (4) 

IF  J>8 
THEN 

PUT  DATA  (X); 

ELSE 

PUT  LIST  ('FAILURE'); 

STOP; 

END  QUEENS; 
REGRESS: PROC(I, J, A, B, C, X);
DECLARE (I, J) FIXED BIN (15,0);
/* ASSERT VALUE (I) (0:8) LEVEL(3) */
/* ASSERT VALUE (J) (0:9) L(3) */
DECLARE (A(1:8), B(2:16), C(-7:7)) FIXED BIN (15,0);
/* ASSERT VALUE (A(*), B(*), C(*)) (0,1) L(3) */
BEGIN;
J=J-1;          /* RECONSIDER PRIOR COL */
IF ¬(J<1)       /* ¬REGRESS OUT OF FIRST COL */
THEN
BEGIN;
/* ASSERT LOCAL VALUE (I,J) (1:8) LEVEL(6) */
/* ASSERT LOCAL VALUE (A(I), B(I+J), C(I-J)) (1) L(6) */
I=X(J);         /*              */
A(I)=0;         /*              */
B(I+J)=0;       /* REMOVE QUEEN */
C(I-J)=0;       /*              */
IF I=8
THEN
BEGIN;
J=J-1;          /* RECONSIDER PRIOR COL */
IF ¬(J<1)       /* ¬REGRESS OUT OF FIRST COL */
THEN
DO;
/* ASSERT LOCAL VALUE (I,J) (1:8) LEVEL(6) */
/* ASSERT LOCAL VALUE (A(I), B(I+J), C(I-J)) (1) L(6) */
I=X(J);         /*              */
A(I)=0;         /*              */
B(I+J)=0;       /* REMOVE QUEEN */
C(I-J)=0;       /*              */
END;
END;
END;
END;
END REGRESS;
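
The control flow of the QUEENS and REGRESS listings can be easier to follow in executable form. What follows is a minimal Python sketch, not the authors' code: the function name `solve_queens` and the parameter `n` are assumptions introduced here, the ASSERT comments are rendered as ordinary `assert` statements, and the two-step regress of the PL/I version is folded into the main loop (a column whose queen was in the last row simply regresses again on the next pass).

```python
def solve_queens(n=8):
    """Return the row of the queen in each column (1-based), or None."""
    a = {i: 0 for i in range(1, n + 1)}       # A(1:8): row I occupied
    b = {k: 0 for k in range(2, 2 * n + 1)}   # B(2:16): diagonal I+J occupied
    c = {k: 0 for k in range(1 - n, n)}       # C(-7:7): diagonal I-J occupied
    x = {}                                    # X(J): row chosen for column J
    j, i = 1, 0
    while 1 <= j <= n:
        safe = 1
        while safe != 0 and i < n:            # try the next row in column j
            i += 1
            safe = a[i] + b[i + j] + c[i - j]
            assert 0 <= safe <= 3             # ASSERT VALUES(SAFE)
        if safe == 0:
            assert a[i] == b[i + j] == c[i - j] == 0
            a[i] = b[i + j] = c[i - j] = 1    # set queen
            x[j] = i
            j, i = j + 1, 0
        else:
            j -= 1                            # reconsider prior column
            if j >= 1:
                i = x[j]
                assert a[i] == b[i + j] == c[i - j] == 1
                a[i] = b[i + j] = c[i - j] = 0  # remove queen
    if j < 1:
        return None                           # regressed out of first column
    rows = [x[col] for col in range(1, n + 1)]
    assert len(set(rows)) == n                # X(II) differs from X(JJ)
    assert all(v == 1 for v in a.values())    # every row holds one queen
    return rows
```

Calling `solve_queens()` returns one placement for the 8 by 8 board; the assertions inside the function play the role of the local and final ASSERT comments in the listings.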
APPENDIX A - SAMPLE PET OUTPUT


The  Program  Evaluator  and  Tester  (PET)  is  a package  of  programs  designed  as 
an  automated  aid  to  assist  in  the  debugging,  testing  and  documenting  of 
computer  programs.  The  preprocessor  module  of  PET  operates  on  a FORTRAN 
source  deck,  gathering  information  on  source  statement  characteristics 
and  inserting  FORTRAN  code  for  automatically  gathering  run  time  statistics. 
It  generates  a modified  FORTRAN  program  containing  both  the  original 
source  code  and  the  inserted  source  code  for  later  compilation  and  execution 
of  an  ordinary  program  case.  Run  time  statistics  are  retrieved  at  program 
termination  or  under  user  control  by  a run  time  library.  Reports  are 
generated  in  a separate  step  by  the  PET  postprocessor. 

Run  time  statistics  that  may  be  gathered  include  statement  execution  and 
branch  counts,  min/max  and  first/last  values  of  assignment  statements 
and  DO-loop  parameters,  detailed  branching  counts  and  relative  subroutine 
execution  timing.  All  but  execution  and  branch  counts  are  optional. 

Source  statement  characteristics  gathered  include  percentages  of  executable 
and  non-executable  statements,  number  of  comment  statements  and  number 
of  statements  with  ANSI  standard  FORTRAN  violations  which  would  affect 
portability  of  the  program  to  other  FORTRAN  systems. 

PET  accepts  as  input  any  FORTRAN  deck  running  on  the  host  machine.  No 
program  modifications  are  necessary.  PET  is  currently  available  on 
the  CDC  6000  and  7000  series,  the  IBM  360  and  370  OS  systems,  and  the 
UNIVAC 1100 series (ASCII FORTRAN compiler).

Sample PET Reports

The reports produced by the PET system present both detailed
statement-by-statement statistics and program summary statistics.

a.  Subroutine  Listing  and  Execution  Counts 

This  contains  specific  statement  execution  data  as  well  as  a profile  of  the 
subroutine.  The  execution  and  branch  counts  are  matched  back  with  the 
original  source  code  and  presented  in  the  form  of  an  annotated  source  listing. 

First, the subroutine source is listed. An "N" is printed to the left of
any statement that violates standard FORTRAN. To the right is the specific
execution data: execution count for that statement, min/max and first/last
values for assignment statements (if those options were indicated),
and specific branching information for applicable statements. True/false
counts are reported for logical IF statements and branch counts for
arithmetic IF and assigned or computed GO TO statements.

Examples of the report for branching statements follow:

original statement          execution   true             false
                            count       count            count

IF (A .EQ. B) GO TO 100     100         TRUE 70          FALSE 30

                                        # of transfers   # of transfers
                                        to 100           to 200

GOTO (100, 200, 300), N     30          BRANCH1 5        BRANCH2 12

                                        BRANCH3 13

                                        # of transfers
                                        to 300

Figure A-1 illustrates a subroutine listing and execution count page.
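
The branch counts in these examples can be pictured as tallies kept by probes that the preprocessor inserts around each branching statement. The following Python sketch illustrates that bookkeeping only; the class `BranchProbe`, its method names and the statement labels are hypothetical and are not part of PET, and the computed GO TO index is assumed to be in range.

```python
from collections import Counter


class BranchProbe:
    """Tallies execution and branch-outcome counts per statement."""

    def __init__(self):
        self.executions = Counter()   # statement label -> execution count
        self.outcomes = Counter()     # (statement label, outcome) -> count

    def logical_if(self, stmt, condition):
        """Record a logical IF and pass the condition through unchanged."""
        self.executions[stmt] += 1
        self.outcomes[(stmt, "TRUE" if condition else "FALSE")] += 1
        return condition

    def computed_goto(self, stmt, labels, n):
        """Record a computed GO TO; n is the 1-based branch index."""
        self.executions[stmt] += 1
        self.outcomes[(stmt, f"BRANCH{n}")] += 1
        return labels[n - 1]
```

For a logical IF the true and false counts always sum to the execution count, and for a computed GO TO the branch counts do the same, which is the consistency visible in the report rows.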
b. Subroutine Summaries

Following the listing is the subroutine profile. The syntactic profile
gives the total number of statements of all kinds in the subroutine and
then lists statements of various types, giving the number and percent of
each in the subroutine.

The statement types and their PET definitions are:

Executable source       All FORTRAN statements except
                        comment and nonexecutable

Nonexecutable source    COMMON, DIMENSION, Type statements,
                        DATA, EQUIVALENCE, OVERLAY Control
                        Cards, STATEMENT FUNCTIONS, FORMAT

Comment                 Statements with 'C' in Column 1

Nonstandard             Statements that violate standard
                        FORTRAN (Figure A-1)

BRANCH                  Total of all GOTO and IF branches

CALL                    All subroutine CALL statements.
                        Function subroutine references are
                        not delineated.

UNFORMATTED I/O         All binary read and write statements

FORMATTED I/O           All read and write statements with a
                        format

The operational profile gives the total execution count for the subroutine
and then, by statement type, lists the number of statements that were
executed and the percent that were executed. For branches (i.e., GOTO
and IF statements), it gives the number of branches that were taken
and the percentage of the possible branches that were taken.

Figure A-2 shows a typical summary report.

c. Program Summary

The  information  on  this  page  has  been  compiled  from  the  subroutine  profile 
into  a composite  profile  of  the  entire  program.  It  includes  a program 
syntactic  profile  and  an  operational  profile  in  the  format  described  in 
the  previous  section. 

d.  Subroutine  Operational  Summary 

This  page  summarizes  selected  subroutine  operational  statistics.  The 
information  gives  a picture  of  how  well  the  case(s)  that  were  executed 
actually  exercised  the  program  branches.  It  can  be  used  by  the  analyst  to 
design  additional  test  cases  to  exercise  other  areas  of  code  or  to  determine 
that  some  areas  of  the  code  can  never  be  executed. 


Figure  A-3 illustrates  a subroutine  operational  summary.  The  following 
explains  the  information  on  the  report  by  column. 


For each subroutine that was executed the following information is given.

Column  Information

A       Subroutine name

B       Number of executable statements that were actually executed

C       Percent of the total executable statements that were
        actually executed. Refer to the subroutine summary
        page for the total number of executable statements.

D       Number of subroutine calls that were actually executed.

E       Percent of the subroutine calls that were executed.
        Refer to the subroutine summary page for the total
        number of call statements.

F       Number of branches that were actually taken.

G       Percent of the possible branches that were actually taken.

The above information is then totaled and combined with the syntactic
statistics for the subroutines that were not executed to give the program
operational summary (ll).


Following the summary is a listing of the names of the subroutines that
were instrumented but not executed (I).

e. Subroutine Execution Summary

This page summarizes the execution data for all the monitored subroutines.
It gives a profile of the types of operations actually performed by the
subroutine (i.e., is it mainly performing I/O or logic or calling other
subroutines). The information on this page combined with the timing
information described in the next section can help the analyst determine
where a program is spending its time and how it could be improved.
Figure A-4 shows a subroutine execution summary. The information on
the report is explained by column below.

Column  Information

A       Subroutine name

B       Number of times the subroutine was called

C       Total execution count = sum of the execution counts for
        each statement

D       Total call count = sum of the execution counts
        for each call statement

E       The percent of the executions that were subroutine
        calls (i.e., total call count/total execution count)

F       Total branch count = sum of the execution counts
        for each branch statement

G       The percent of the executions that were branches (i.e.,
        total branch count/total execution count)

H       Total input/output count = total execution count
        for formatted and unformatted read and write statements

I       The percent of the executions that were I/O (i.e., total
        input/output count/total execution count).
f. Subroutine Timing Summary

The statistics for subroutine timing are gathered by accessing the system
clock at all entries into and exits from a subroutine. The accumulation
of timing statistics is turned off when a subroutine call is made and
when a user-supplied function subroutine is referenced in the first term
of an assignment statement. Note that timing is not turned off when a
function subroutine is referenced in an IF statement; thus the time spent
in the function will be included in the total time of the calling subroutine.
In order to obtain the most accurate timing history, it is recommended
that function references be made in separate assignment statements. Note
also that time spent in FORTRAN-supplied functions and in system routines
will be accumulated in the calling routine.

The timing history obtained is distorted by the PET instrumentation. For
this reason, when an accurate timing history is desired, it is recommended
that the min/max and first/last options not be used. However,
experience has shown that the relative time spent in subroutines is
not significantly distorted by the PET instrumentation.

A function  subroutine  reference  will  be  considered  to  be  user-supplied  unless: 

1)  it  is  in  a table  of  FORTRAN-supplied  functions 

2)  it  is  a FORTRAN  intrinsic  function,  but  was  referenced 
in  an  external  or  type  statement. 
The various types of timing information output on the subroutine timing
summary are enumerated below.

. The total execution time for the case is the elapsed time
  from entry into main routine to exit from the instrumented programs.

. The total time in monitored routines is the total time spent
  in routines for which option 32 was turned on.

. The total time in other routines is the difference between the
  first two numbers.

. The histogram of subroutine timing is output for those subroutines
  for which the timing option is turned on and which accumulated
  some time on the clock.

. The actual time and the percent of monitored time spent in each
  of the above subroutines is output to the right of the histogram.

. The elapsed time spent in the postprocessor is output.

If  the  timing  option  was  not  used,  only  the  total  execution  time  for  the 
case  and  the  total  time  in  the  postprocessor  will  be  given. 

Figure A-5 shows a typical timing summary. Subroutine XUB accumulated the
largest amount of time, so it was printed with 50 asterisks. The actual
time spent in XUB was 31.926 seconds and the percent of monitored time
was 31/71=35%. Subroutine PCUM accumulated 25.709 seconds so it has
40 asterisks = 25.705/31.926x50.
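
The scaling of the histogram can be checked with a few lines of arithmetic. The Python sketch below is an illustration under stated assumptions, not the PET postprocessor: the function name `histogram_rows` is hypothetical, the bar length is assumed to be truncated to a whole number of asterisks, and PCUM's time is taken as 25.709 seconds.

```python
def histogram_rows(times, width=50):
    """Scale each subroutine's time so the largest gets `width` asterisks.

    `times` maps subroutine name to seconds; routines that accumulated
    no time on the clock are left out, as in the timing summary.
    Returns name -> (bar, seconds, percent of monitored time).
    """
    timed = {name: t for name, t in times.items() if t > 0}
    if not timed:
        return {}
    longest = max(timed.values())
    monitored = sum(timed.values())
    rows = {}
    for name, t in timed.items():
        stars = int(t / longest * width)   # assumption: truncate, not round
        rows[name] = ("*" * stars, t, 100.0 * t / monitored)
    return rows
```

With XUB at 31.926 seconds and PCUM at 25.709 seconds, XUB receives the full 50 asterisks and PCUM receives 25.709/31.926 x 50, which truncates to 40.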
APPENDIX B

SAMPLES FROM A PROPOSED STRUCTURED FORTRAN ASSERTION SYSTEM

INTRODUCTION 

JAVS, for JOVIAL Automated Verification System, is an integrated
set of software testing tools created to assist in testing computer
programs written in the J3 dialect of JOVIAL. RXVP is a similar, but more
extensive, set of tools for FORTRAN and IFTRAN programs. Both systems
are derived from RSVP which was originally created by Dr. Edward F.
Miller, Jr. and Dr. Michael Paige. JAVS design and RXVP development were
major efforts of the GRC Program Validation Project while Dr. Miller was
the leader of that group. The emphasis of this description will be JAVS,
which is currently being applied to testing a large software package at
RADC. Unfortunately, it is too early in that program to permit me to
comment on the contribution of JAVS to the testing effort. The results
of these tests will be published as an RADC report early next year.

In Fig. 1, I have partitioned the universe of software behavior
in two ways: the specified and unspecified, and the acceptable and
unacceptable. The collective experience with software development says
that all four of these forms of behavior will still exist at the time
software is declared ready for acceptance, and will continue to exist
after acceptance testing is over, primarily because the testing process
is confined to examining points in the vector space of the input.

In a typical software test activity the testing group is attempting
to map these regions by probing with single-point test cases. Their
success is dependent on the total resources devoted to exploring the
universe of behavior and the effectiveness of those resources in terms
of the location of the tested points in that universe. Test tools such
as JAVS and RXVP are designed to increase the effectiveness of the
testing process and to replace human resources with computer resources.

Figure 1. Universe of Software Behavior

Figure 2 shows how an AVS such as JAVS or RXVP fits into the testing
activity. The specification is presumed to be the principal source
from which the program is derived, and in addition the source of a set of
functional test cases and acceptance criteria against which the
acceptance tests can be graded. I also indicate in this diagram the usual
situation where the programmer generates source code that is not
required by the specification, and also generates some test data that
exercises the capability so provided. All test cases that exist when
the software is released for acceptance testing will undoubtedly lie in
the acceptable region. The acceptance tester will be charged with
generating new test cases (the supplied cases have already been learned
by the software) that demonstrate acceptable behavior while spanning the
universe of total behavior to identify the regions of unacceptable
behavior.

There are five functions performed by the AVS in this role:

• Analysis of source code and creation of a library of module
data

• Generation of reports based on static analysis of the source
code that reveal existing or potential problems in the code,
and identify the software structure

• Insertion of software probes into the source code that
permit data collection on code segments executed and values
computed and set in storage

• Analysis of test results and generation of reports

• Generation of test assistance reports to aid in organizing
the testing and deriving input sets for particular tests

Figure 2. Software Analysis and Testing Augmented by AVS

The  following  sections  describe  these  functions  as  they  exist  in  JAVS 
and  RXVP. 

MODULE  DATA  BASE 

An  important  feature  of  an  AVS  is  the  ability  to  handle  large 
software  systems  efficiently.  Both  JAVS  and  RXVP  build  libraries  of 
information  about  the  software  being  tested.  These  libraries  preserve 
an  internal  copy  of  the  source  code  and  the  tables  of  information  that 
result  from  the  various  analyses  of  the  source  code.  The  source  code 
and  tables  are  available  for  generation  of  reports  through  a set  of 
simple  interface  routines  that  provide  apparent  direct  access  to  any 
information  in  the  library.  A library  manager  component  handles  the 
necessary  transfers  between  mass  storage  and  the  core  region  referred 
to as working storage. The collected data on any module is referenced
by the module name, where programs, subroutines, and functions are
modules in FORTRAN and programs, procedures, and closes are modules in
JOVIAL. Within modules, source text is accessed by line number and the
various tables are accessed by entry and word number. The analysis and
reporting routines have a simple interface to all the source code of a
large system, and this simple interface permits easy addition of new
analyses and reports.


STATIC  ANALYSIS 


One approach to increasing software quality (i.e., reducing the
region of unacceptable behavior) is to prohibit the use of certain coding
practices that are prone to mistakes. Where the compilers do not enforce
such standards, the AVS is a convenient technique for identifying
violations of standards or clear mistakes in the programming. Static
analysis also produces reports on module and system structure that would
normally be produced one time as a form of documentation. A partial
description of the available reports from both JAVS and RXVP is as
follows:

• Multiple  Module  Cross  Reference 

This  report  lists  all  symbols  that  appear  in  all  modules  of  a 
library  and  for  each  module  where  the  symbol  appears  gives  the 
statement  number  and  flags  declarations,  and  the  statement 
numbers  of  statements  where  the  symbol  is  set.  This  report 
is  shown  in  Fig.  3. 

Figure 3. Multi Module Cross Reference Report

• Module Invocation Report

This report lists all of the references to other modules for
a single module in addition to all invocations of that module
from other modules in the library, giving the statement number
and the text of the invocation. In RXVP this report also
includes a comparison of the actual parameters of an invocation
with the formal parameters of the module description
(see Fig. 4).

• Module Dependencies

There are four static reports that display module dependencies
within a library or selected groups of modules on a library.
The invocation tree is complete to the extent it can be
derived from the library and would normally be produced with
respect to the program that sits at the top of the hierarchy.
The other three reports are produced in matrix form and are
shown in Figs. 5, 6, and 7. Figure 5 is an invocation matrix
for all modules on a library. Figure 6 shows a similar matrix
for all invocations from a library to modules that are not on
the library data base. The report shown in Fig. 7 displays the
internal interconnections of a selected group of modules as well
as invocations to the group and invocations from the group.
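
The matrix reports can be pictured as simple derivations from a table of "module invokes module" facts held in the library. The Python sketch below is an illustration only; the function name `invocation_matrix`, the input layout and the module names in the usage example are assumptions, not the JAVS or RXVP data base format.

```python
def invocation_matrix(calls):
    """Build an invocation matrix from a mapping caller -> invoked names.

    Modules that appear as keys are treated as being on the library.
    Returns (modules, matrix, external) where matrix[r][c] is True when
    modules[r] invokes modules[c], and external maps each caller to the
    invoked names that are not on the library.
    """
    modules = sorted(calls)
    on_library = set(modules)
    matrix = [[callee in calls[caller] for callee in modules]
              for caller in modules]
    external = {}
    for caller in modules:
        missing = sorted(set(calls[caller]) - on_library)
        if missing:
            external[caller] = missing
    return modules, matrix, external
```

For example, `invocation_matrix({"MAIN": {"INIT", "SOLVE"}, "INIT": set(), "SOLVE": {"SQRT"}})` marks MAIN as invoking INIT and SOLVE and lists SQRT as an invocation from SOLVE that leaves the library.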


The remaining functions of JAVS and RXVP are those that support
testing of the execution. These components support automatic
instrumentation, data collection, data analysis, and assistance in
generating new test cases.


AUTOMATIC INSTRUMENTATION

Both JAVS and RXVP automatically insert software probes such that
a record can be made of the statements and decision outcomes that have
occurred during execution of the program. This form of instrumentation
is associated with the smallest logical unit of code. This logical unit
is identified in the source text as the sequence of statements lying
between two decisions and is called a decision-to-decision path (DD-path).
The instrumentation component of the AVS performs the required analysis,
generates the text of the probe, adds it to the source code and, if
desired, stores the instrumented module back on the library. An
instrumented JOVIAL module is normally kept in source form on the library
since it must be compiled with proper variable scope. Instrumented FORTRAN
modules are typically saved as compiled object code.

In addition to the decision-to-decision path instrumentation,
RXVP provides full parameter instrumentation, in which case data is
collected on each variable as it is set, on DO loop parameters as used,
and on logical values at decisions. JAVS allows the user to insert
instrumentation through which data is collected on all values set into
selected variables or on values which do not fall within a range
definition.

TEST  EXECUTION  AND  DATA  ANALYSIS 

In test execution the instrumented source code generated by the AVS
is compiled, loaded with the data collection routines, and then executed
in its normal test environment. During execution, probe data is
collected while the program executes the test cases.

There  are  two  levels  of  detail  at  which  data  is  collected.  When 
the  desired  measure  of  test  effectiveness  is  DD-path  coverage,  path 
counts  are  collected  in  core  and  only  the  final  values  of  the  counts  are 
written  to  a file  which  is  input  to  the  test  analyzer.  If  sequence 
information  on  path  executions  is  desired,  a trace  file  may  be  written 
for  input  to  the  test  analyzer. 

The function of the test analyzer is to organize the execution
data, generate reports, and accumulate the results of new tests into
the test history. The basic reports are geared to a test objective of
executing each DD-path at least once. The detailed reports show path
counts by path number and statement counts appended directly to the
statements in the source listing. Summary reports provide a list of
decision-to-decision paths not hit and measures of test effectiveness
in terms of DD-path executions as a percentage of total DD-paths in a
module. All reports are generated both for single tests and for the
accumulated test history.
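
The coverage measure and the accumulation of a test history can be sketched in a few lines. The Python below is an illustration under stated assumptions, not the JAVS or RXVP test analyzer: the function names `coverage` and `accumulate` are hypothetical, DD-paths are assumed to be numbered 1 through `num_paths` within a module, and the probe data is assumed to arrive as a mapping from path number to execution count.

```python
from collections import Counter


def coverage(num_paths, counts):
    """Return (percent of DD-paths executed, list of DD-paths not hit)."""
    not_hit = [p for p in range(1, num_paths + 1) if counts.get(p, 0) == 0]
    percent = 100.0 * (num_paths - len(not_hit)) / num_paths
    return percent, not_hit


def accumulate(history, counts):
    """Fold the path counts of a new test into the test history."""
    total = Counter(history)
    total.update(counts)      # Counter.update adds counts together
    return total
```

Running `coverage` on a single test's counts gives the single-test report, and running it on the result of `accumulate` gives the cumulative report; the list of paths not hit is what directs the next round of test case design.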

TESTING  ASSISTANCE 

The capabilities of the AVS described so far will carry a testing
program through execution of existing test cases and provide DD-path
coverage analysis of the results. Both JAVS and RXVP provide additional
reports that are designed to help generate new test cases and develop the
understanding of the program organization that is essential to an
efficient application of the human resources to the testing process.

A testing approach that makes use of these functions of an AVS is
shown in Fig. 8. The data collection and analysis provides a measure of
testing effectiveness in terms of coverage of DD-paths. A minimum
testing goal is execution of all DD-paths in a program. This approach
suggests minimizing human resources with automatic generation of test
cases. At the simplest level the new test cases can be generated by
some algorithm that systematically explores the input space. If a
program runs without catastrophic failure, such a fully automated
approach would serve well in an environment where a computer could be
dedicated to exercising the program. If there were a few simple
criteria for satisfactory performance, this system could also evaluate
performance automatically. In our work with BMDATC, we are currently
investigating this approach where the source of unacceptable performance
is inputs that are beyond the capability of the system rather than
mistakes in the implementation.

Figure 8. Program Testing with AVS

The above approach does not necessarily succeed in achieving full
coverage of all DD-paths within some reasonable number of test cases
and, of course, in the absence of a dedicated machine, it may be
completely impossible because of the number of tests generated. One of the
major shortcomings of this approach is that it does not provide much
additional understanding of the implementation and what its internal
structure is capable of. Another approach that uses fewer test cases,
but which presently uses more human resources, is to direct the testing
attention to particular untested DD-paths and attempt to generate test
cases specifically for those paths. Figure 9 shows a more detailed look
at the steps involved in that approach. The AVS provides direction for
generating these test cases and ideally would produce them automatically.

The present versions of JAVS and RXVP are still dependent on human
intervention to derive the input data needed to cause execution of a
particular DD-path. This is greatly facilitated, however, through reports
describing the decision outcomes and computations preceding entry to the
desired DD-path.

Figure 9. DD-Path Testing

The  reports  are  the  reaching  set  report  (Fig.  10)  and  the 
sequence  report  (Fig.  11).  The  reaching  set  report  gives  the  ur 
all  DD-path  sequences  that  provide  connectivity  between  starting 
ending  DD-paths.  All  predicates  that  have  decision  outcomes  tha 
cause  departure  from  the  reaching  set  are  flagged  as  essential  a 
direct  the  tester  to  those  conditions  that  must  prevail  in  a tes 
to  reach  a particular  testing  target. 
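
One way to picture the reaching set is as a computation on a directed graph whose nodes are DD-paths and whose edges lead from a DD-path to the DD-paths that can follow it. The Python sketch below rests on assumptions not confirmed by the text: the reaching set is taken to be every DD-path lying on some sequence from the starting DD-path to the target, and a DD-path is flagged as essential when at least one of its outcomes leaves that set. The function names and the graph layout are hypothetical.

```python
def reachable(graph, start):
    """Return every node reachable from start by following edges."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def reaching_set(graph, start, target):
    """Return (DD-paths on some route start -> target, essential ones)."""
    reverse = {}
    for node, successors in graph.items():
        for nxt in successors:
            reverse.setdefault(nxt, []).append(node)
    # On a route: reachable from start and able to reach the target.
    on_route = reachable(graph, start) & reachable(reverse, target)
    essential = sorted(
        node for node in on_route
        if node != target
        and any(nxt not in on_route for nxt in graph.get(node, ()))
    )
    return on_route, essential
```

In the graph `{1: [2, 3], 2: [4], 3: [5], 4: [6], 5: [], 6: []}` with start 1 and target 6, the reaching set is {1, 2, 4, 6} and DD-path 1 is flagged, because its outcome leading to DD-path 3 can never reach the target.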

RXVP  also  provides  the  DD-path  sequence  report  which  produ 
detail  on  a particular  sequence  of  DD-paths.  Figure  11  shows  th 
report,  which  includes  a composite  predicate  for  the  path  as  wel 
tables  of  variables  used  and  set  along  the  sequence.  Our  experi 
version  of  RXVP  also  provides  backtracking  capability  to  siraplif 
composite  predicates  and  give  more  detailed  information  on  how  v 
set  and  used  along  the  path  affect  the  results. 

Another  capability  that  is  supported  in  RXVP  is  conversion 
conventional  FORTRAN  into  structured  IFTRAN.  This  approach  has 
esting  possibilities  for  simplifying  test  case  generation  when  a 
to  reaching  sets.  Figure  12  shows  a somewhat  tortured  segment  o 
FORTRAN and Fig. 13 shows the same code rewritten in IFTRAN.

Figure 10. Reaching Set Report

Figure 12. Unstructured Reaching Set

Figure 13. Structured Reaching Set

FUTURE PLANS


We  are  continuing  the  investigation  of  test  tools  that  are  useful 
for  large  software  system  testing  as  described  here.  A primary  source  of 
experience  to  support  those  investigations  will  be  the  current  testing 
being  performed  at  RADC.  The  results  of  that  effort  will  be  reported 
in  the  early  part  of  next  year.  In  addition  our  experimental  activities 
are  addressing  the  following  topics: 

• Automatic  documentation  aids 

• Test  effectiveness  measures 

• Automatic  test  case  generation 

• Automatic  structurizing 

• Static analysis reports

INTRODUCTION


The enormous amount of attention focused on issues of software
reliability and effectiveness in the last decade has resulted from
recognition of "software" as a potentially serious problem in
computer-based systems. "Software" has taken on the aura of the
culprit, and remedial measures aimed at minimizing the chance that a
serious error would escape unnoticed have demonstrated their
effectiveness. Very loosely, these measures are called "verification
and validation" (V&V) strategies.

At the outset, ad hoc procedures were used, with startling
success. In the past five years, the body of techniques has become
refined, and systematic application of them has become widespread.
Almost simultaneously, and perhaps as a result of the need
identified by the specific requirements of the technology base for V&V,
research and development efforts began to address (and develop) a
much stronger theoretical and methodological base.

The present state-of-the-art can be characterized as having
reached a kind of plateau, both technologically and operationally.
If the complexity of the software systems for which V&V is to be
performed remained constant, there would be a (relatively) stable
situation. The opposite is happening, of course: as software systems
grow in overall complexity, the need for sophisticated methods for
performing the V&V function grows accordingly.

The purpose of this paper is to assess the current state-of-the-art,
and the prospects for future development, of three
subtechnologies within the larger arena of V&V:

1. Techniques of systematic program testing as they can be
applied de facto to large-scale (i.e., highly sophisticated)
software systems.

2. Evolving methods of software design and production that
offer the opportunity to produce bigger software systems
more quickly and more reliably, and which possess
features that make them amenable to a wide range of
affirmation techniques.

3. Methods of requirements and specification analysis and
decomposition that offer a secure bridge between system
concept and system reality.

The  discussion  is  partitioned  into  these  areas  because  each 
represents  a critical  part  of  an  overall  "solution"  to  the  software 
problem.  The  reasoning  that  supports  this  position  is  straightfor- 
ward : 

1.  A well  conceived  statement  of  system  requirements  that 
is  internally  self-consistent  forms  the  "standard" 
against  which  the  final  product  (i.e.,  programs)  can  be 
compared. 

2.  A well-designed software system is intrinsically better
suited to systematic testing as a means of verifying the
consistency between the requirements and the final
product.

3.  Current and projected developments in program testing
appear to offer the surest method of validating the relation
between a software system's requirements and its
final-product form.

Figure  1-1  illustrates  the  way  in  which  these  ideas  interact 
to  achieve  confidence  in  a software  system  as  the  "solution"  of 
the  problem  that  was  supposed  to  be  solved. 

Although  the  interactions  illustrated  in  Fig.  1-1  may  appear 
vague,  they  represent  a feasible  plan  for  achieving  truly  reliable 
software.  As  the  technology-oriented  material  in  the  next  sections 
will  indicate,  there  is  much  to  be  done  before  this  overall  plan 
can  become  reality. 

2.  PROGRAM  TESTING 

Procedures for testing programs have undergone a minor revolution
in the past five years; the revolution can be expected to continue.
Techniques used divide into two natural categories:

a.  Systematic de facto testing, which assumes that the
software system being analyzed is stable.

b.  "Reliable" testing methods, which seek to affirm
error-free status of parts of a software system by running
specifically chosen tests.

Methodologies,  and  corresponding  support  tools,  for  the  former  area 
are  well  developed;  the  theoretical  basis  for  the  latter  is  only  now 
being  unraveled. 

2.1  Structural  Based  Systematic  Testing 

Procedures for testing programs center on the development of
a directed graph model of the program(s) (Ref. 1). This model is
used to indicate a least-complex set of subprogram elements that
are suitable targets for actual testing coverage measurement (Ref. 2).
When every part of the program has been executed (and when that
execution has been verified by independent means) one can be certain
that no latent static faults reside within the software system.
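
As a rough illustration of the directed-graph model and the coverage
measurement described above, the following Python sketch represents a
small program as a graph of program segments and reports which edges a
set of executed test paths has exercised. The graph, the node names and
the function names are hypothetical; they are not taken from any of the
referenced tools.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]

# Hypothetical control-flow graph: each node maps to its successor nodes.
FLOW_GRAPH: Dict[str, List[str]] = {
    "entry": ["test"],
    "test": ["then", "else"],   # a two-way decision
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}


def all_edges(graph: Dict[str, List[str]]) -> Set[Edge]:
    """Every edge of the graph is a candidate target for coverage."""
    return {(src, dst) for src, succs in graph.items() for dst in succs}


def covered_edges(paths: List[List[str]]) -> Set[Edge]:
    """Edges traversed by the recorded execution paths."""
    covered: Set[Edge] = set()
    for path in paths:
        covered.update(zip(path, path[1:]))
    return covered


def coverage_report(graph: Dict[str, List[str]],
                    paths: List[List[str]]) -> Tuple[float, Set[Edge]]:
    """Return (fraction of edges executed, edges not yet executed)."""
    edges = all_edges(graph)
    hit = covered_edges(paths) & edges
    return len(hit) / len(edges), edges - hit


if __name__ == "__main__":
    ratio, missing = coverage_report(
        FLOW_GRAPH, [["entry", "test", "then", "exit"]])
    print(f"coverage: {ratio:.0%}, untested edges: {sorted(missing)}")
```

The list of unexecuted edges is what a tester would use to decide which
new testcase to construct next.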

A major problem in accomplishing this goal, both on a single-module
(or unit) level and on a subsystem level, is the process of
constructing new testcases that make the program (or program set)
behave in a particular way. Manual, semi-automatic and automatic
techniques (discussed later) have been devised to accomplish this.

Experience with the methodology associated with systematic
program testing (Ref. 3) has, so far, indicated the general utility
of the technique but has not achieved all of the results expected.
As several people have pointed out (Ref. 4, for example), use of
this approach is not without its successes. The major drawback
appears to be the cost of implementing the process.

Higher levels of testing coverage, involving groups of modules
and possibly taking into account the internal structure of the
algorithms, have been suggested (Ref. 5), but are still in the process
of being implemented.

2.2  Automation  in  Testing 

The underlying theme of this species of formalized testing is
"automation of function," which is done for two reasons:

1.  To reduce the costs associated with applying a methodology
to bearable levels.

2.  To  assure  that  the  systematic  testing  process  itself  does 
not  make  any  mistakes. 

Progress  in  automating  the  functions  has  been  substantial,  as  the 
brief  review  of  major  systems  given  next  will  serve  to  suggest. 

The notion of a generalized program analysis facility was first
embodied in the JAVS system, developed for RADC (Ref. 6). This system
processes JOVIAL/J3 programs and provides basic functions of
instrumentation, execution-history data collection, and machine-assisted
support for generation of new testcase data. A related system, RXVP,
uses the same techniques to process systems of Fortran programs
(Ref. 7). A number of other systems (too numerous to mention
specifically) are strong adherents to the basic philosophy established
by JAVS. All of these automated tools, which might be considered the
"second generation" of the genre, suffer primarily from high execution
cost and lack of attention to the user interface. This is certainly
not improper, since what is currently available tends to be
in the form of "R&D prototypes." The body of experience gained from
all of this work serves as a very strong basis for future development.

2.3  Integrated  Systems 

There are several ongoing activities aimed at producing
machine-independent, language-dialect-independent systems of automated
program analysis tools. In one activity at NASA, for example, a
full-spectrum approach is being applied in a way that culls the best
methods from those available (Ref. 8) in other, earlier systems.

Figure  2-1  describes  the  major  functions  that  such  systems 
typically  implement.  In  a current  development  activity  (Ref.  9)  a 
unified attempt is being made to develop the entire spectrum of
program analysis functions on a non-proprietary basis. The intention
of  that  activity  is  to  have  specific  functions  selectable  by  a user 
(a  formal  Test  and  Evaluation  activity),  and  provide  a system  and 

ATA  Automated Testing Analyzer. Provides the basic "window" into
the program testing process... tells the percentage coverage
in terms of a very simple (yet powerful) measure.

DAP  Dynamic Assertion Processor. The programmer puts assertions about the way
programs are supposed to behave into the code (they're treated as comments
by the compiler), but the tool makes them report on when the assertions
are violated.

SMI  Self-Metering Instrumentation. The programs are completely instrumented
(but logical integrity is preserved) so that all conceivably interesting
data about each program's execution-time behavior is collected and
reported (under user command).

TDE  Testing Difficulty Estimator. Tells, according to detailed program
structure analysis, which programs in a set are the easiest and most
difficult to test... and helps management control the testing resource
better.

ATG  Automated Testcase Guidance. Helps a user construct test data that
meets easily-stated objectives (...execute this path...).

ATDG  Automated Test Data Generation. Automatically generates test data
that meets specific objectives, using advanced heuristic processes
that have high success ratios for this very difficult problem.

ATF  Automated Test Facility. Constructs a stubbed and standard Input/Output
environment for a single program or a set of programs... all automatically.

SAA  Static Allegation Analyzer. Applies advanced techniques to analyze
programs statically to prove important allegations about the programs'
behavior dynamically... for example, that "all variables are set before
they are used."

RTA  Reliable Test Analyzer. Checks out the reliability of a test case in
protecting against assignment and/or control errors, thereby helping
to maximize the surety attained in a rigorous testing activity.

RLP  Robust Language Processor. Automatically rewrites a program so that
it "can't fail" during execution without first telling the user exactly
how it did fail. Used during debugging and checkout testing to (a) isolate
mistakes and (b) minimize lost execution time.

TPS  Test Planning/Status. A tool for testing management that provides continual
status checks on the testing activity, and an archive for test data.

AMA  Automated Modification Analyzer. Automatically analyzes a program set for
the potential impact a proposed program change will have on the testing
process... and advises what re-testing will have to be performed as a consequence
of the change.

Fig. 2-1.  A typical integrated family of test and evaluation tools.

language-specific "package" tailored to the user’s actual
Significant  attention  is  being  paid  to  the  "user  interface 
well  as  the  volume  of  analytic  output  produced. 
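
The Dynamic Assertion Processor entry in Fig. 2-1 describes assertions
that the compiler treats as comments but that a tool can make report
violations. A minimal sketch of that idea follows, in Python rather than
JOVIAL or FORTRAN. The "# ASSERT" comment syntax, the function names and
the sample program are all assumptions made for illustration; they do not
describe any actual tool.

```python
import re
from typing import List

# Hypothetical convention: an assertion is a comment of the form
#     # ASSERT <boolean expression>
ASSERT_COMMENT = re.compile(r"^(\s*)#\s*ASSERT\s+(.*)$")


def activate_assertions(source: str) -> str:
    """Rewrite each assertion comment into code that records a violation."""
    out = []
    for line in source.splitlines():
        match = ASSERT_COMMENT.match(line)
        if match:
            indent, expr = match.groups()
            out.append(
                f"{indent}if not ({expr}): _violations.append({expr!r})")
        else:
            out.append(line)
    return "\n".join(out)


# Sample program text; without the processor the assertion is only a comment.
PROGRAM = """
def scale(x, factor):
    # ASSERT factor > 0
    return x * factor
"""


def run_with_assertions(source: str, call: str) -> List[str]:
    """Execute the instrumented program and return the violated assertions."""
    violations: List[str] = []
    namespace = {"_violations": violations}
    exec(activate_assertions(source), namespace)   # define the program
    eval(call, namespace)                          # run one testcase
    return violations


if __name__ == "__main__":
    print(run_with_assertions(PROGRAM, "scale(2, -1)"))
```

Because the assertions are ordinary comments, the uninstrumented program
text compiles and runs unchanged; only the rewritten copy reports.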

2.4  Automatic  Testcase  Generation 

The  issue  of  generating  new  testcase  data  has  receive 
attention  recently;  achieving  this  goal  seems  to  lie  at  th 
of  widespread  application  of  systematic,  automated  testing 
ogies.  There  have  been  many  notable  efforts,  but  only  a f 
The  general  problem  of  testdata  generation  is  known  to  be 
undecidable  (Ref.  10);  consequently,  any  technique  used  m 
heuristics  of  some  kind. 

A system  of  refining  new  testcase  data  from  known  inf 
about  existing  testcases  appears  to  be  the  most  effective 
(Ref.  11).  This  system  adapts  a testcase  for  which  the  ex 
behavior  is  already  known  to  a new  purpose — presumably  to 
testing  coverage  of  some  as  yet  untested  program  element, 
istics  required  appear  to  be  achievable  in  practice  for  pr 
programming  languages  such  as  FORTRAN  and  COBOL. 
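
The idea of adapting a testcase whose behavior is already known, so that
it reaches an as yet untested program element, can be sketched as a
simple heuristic search. The Python example below is an assumption-laden
illustration only: the program under test, the doubling step size and
all names are hypothetical, and the search is not the method of Ref. 11.
Since the general problem is undecidable, the search may give up and
return nothing.

```python
from typing import Callable, Optional


def classify(x: int) -> str:
    """Hypothetical program under test with one rarely taken branch."""
    if x > 100:
        return "large"      # the untested element we want to reach
    return "small"


def refine_testcase(seed: int,
                    reaches_target: Callable[[int], bool],
                    max_doublings: int = 32) -> Optional[int]:
    """Perturb a known testcase in growing steps until the target is hit.

    Returns the new test input, or None if the heuristic gives up.
    """
    step = 1
    for _ in range(max_doublings):
        for candidate in (seed + step, seed - step):
            if reaches_target(candidate):
                return candidate
        step *= 2
    return None


if __name__ == "__main__":
    # The seed 3 is an existing testcase known to exercise "small".
    new_input = refine_testcase(3, lambda x: classify(x) == "large")
    print("new testcase input:", new_input)
```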

Related  techniques  of  symbolic  execution  (Ref.  12,13) 
promise  so  long  as  the  user  is  interacting  with  the  proces 
alyzing  potential  program  flows.  Symbolic  execution  techn 
vide  the  manipulative  basis  for  dealing  with  the  complex  s 
of  "real"  programs  with  acceptable  execution  costs.  In  on 
effort  the  focus  is  on  symbolic  evaluation  methods  that  pe 
veloping  testing  coverage  data  for  assembly  language  progr 
reference  to  actual  execution  at  all  (Ref.  14). 

2.5  Reliable  Testing  Theory 

In  a landmark  paper  by  Goodenough  and  Gerhart  (Ref.  1 
shown  that  reliable  methods  of  program  testing  can  be  equi 
a formal  program  proof  of  correctness.  So  long  as  certain 
are  met,  it  appears  that  it  is  possible  to  devise  tests  th 
"proof" against many classes of program errors. Howden (R
has assessed the utility of the approach and found (in in
terms)  that  in  fact  this  is  the  case  except  for  programs  i 
structural  faults.  Thus,  the  theory  suggests  that  it  sho 
sible  to  employ  testing  as  a means  for  certifying  program 
so  long  as  potential  program  errors  (within  the  classes  c« 
do  not  affect  the  structural  properties  of  the  algorithm  1 
plemented . 

The  reverse  of  this  argument  may  offer  the  solution 
ing  against  structural  flaws:  if  structural-based  reliab] 
are  effective  against  non-structural  errors,  then  non-strv 
testing  may  be  effective  against  structural  areas  (Ref.  2\ 

If  this  notion  is  fully  elaborated,  the  significance 
enormous  because  it  would  imply  the  existence  of  an  integi 
methodology  for  satisfactorily  testing  out  faults  in  a car 
software  system,  regardless  of  whether  they  are  structural 
structural.  An  important  area  for  investigation  is  the  ac 
of  automated  test  generation  techniques  to  the  production 
tests  only . 

2.6  Summary 

An  evidently  continuous  spectrum  of  methodologies,  ar 
quite  continuous  span  of  associated  automated  tools,  appes 
evolving.  The  ultimate  effect  will  be  to  systematize  the 
verifying  that  an  actual  software  system  indeed  satisfies 
tional  requirements  imposed  on  it.  The  "bridge"  between  z 
gram text and requirements text appears in the context of
to  partition  programs'  input  spaces  according  to  stated  re 
(see  below).  Once  this  is  accomplished,  establishing  the 
necessary  to  affirm  a system  designer's  reliance  on  the  ve 
the  software  would  be  reduced  to  a human  (and,  potentiallj 
assisted)  understanding  problem. 


3.  SOFTWARE  DESIGN/PRODUCTION 

The second ingredient in the "formula" suggested in Sec. 1
includes the use of advanced techniques of software design and
production. It is necessary to distinguish between software design and
requirements/specifications because they deal with different entities:
software design involves the incremental refinement of a statement of
the form and content of a set of programs, whereas
requirements/specification analysis must be performed in terms of
abstractions of (human or machine) understanding of the problem being
solved.

Advances in the realm of machine-assisted software design are
only now being reduced to practice, following a period of
consolidation within the software community that is focusing on two
disciplines: structured programming and hierarchical design.

3.1  Structured  Programming 

The  computer  science  community's  love  affair  with  Structured 
Programming  probably  owes  its  existence  to  requirements  in  the  late 
60's  for  a more  widespread  technology  of  programming  to  apply  to 
then-current  development  projects.  The  basic  notions  of  structured 
programming  are  simple,  effective,  and  contagious. 

The actual focus of research and development has, however, been
broader than simply providing a means to better deal with programs
proper. As the RADC Structured Programming Series (Ref. 16) suggests,
much more is involved. It can be assumed that the body of information
contained in that Series will become the standard interpretation
of the meaning of Structured Programming.

An important part of the new-style process for developing software
designs is the use of HIPO (Hierarchical Input/Process/Output)
charts as part of the supporting media (Ref. 17,18). Combined, these
mechanisms provide the framework for a systematic approach to
designing software.


3.2  Automation  in  Software  Design 

The  "translation"  of  a requirement  (or  specification)  into 
the design for a working piece of software is an art, at least as
it  is  currently  practiced.  There  is  reason  to  believe  that  there 
never  can  be  a "science"  that  describes  the  design  process;  this  does 
not  mean  that  the  media  used,  and  the  transformations  between  them, 
cannot  be  treated  scientifically.  The  actual  conceptualization  of 
an  appropriate  algorithm  is  an  abstract  process  not  amenable  to 
rigorous  treatment. 

Thus, if one's purpose is to minimize the cost of software design,
the route by which this can be done is to automate the processing of
the media. This philosophy meshes well with the facts (as they are
understood): human input is necessary to achieving good software
design.

Automated Software Production systems, such as are currently
being developed (Ref. 9), attempt to (a) minimize the low-level effort
required to capture a good software design, and (b) maximize the
usefulness of every input provided by the human. For example, current
concepts suggest that the overwhelming majority of detailed
cross-referencing and other low-level tasks (those that can be
performed by machine) should be automated. At the same time, a generalized
database  supporting  the  software  design  process  assures  that  there  is 
a continuous  archive  of  the  entire  process.  Interactive  use  by  skilled 
software  designers  is  assumed.  Much  of  what  currently  is  acceptable 
as  program  documentation  is  produced  automatically,  after  the  design 
process  is  complete. 

3.3  Automation  in  Software  Production 

Present software production processes still involve significant
manual input; there is still a need for programmers to "debug"
programs. (This is an activity distinguished from formal testing by
the fluid state of the program text at that point.) Automating the
actual production process has been the subject of numerous papers
and reports. Some of the best ideas seem to be:


1.  The  use  of  a generalized  macroexpansion  capability  to  minimize 
the  need  for  repeating  often-performed  actions. 

2.  The development of sophisticated conditional editing
functions, which would allow generalized programming independent
of specific machine environments and would automate the
process of producing specific program texts for differing
environments.

3.  Self-hosting  of  the  software  production  system  (so  that  it 
is  used  to  produce  a working  copy  of  itself),  and  the  use 
of  "portable  programming"  methods  to  assure  intrinsic 
transportability.  FORTRAN  and  COBOL  subsets  are  envisioned 
as  practical  universal-implementation  language  bases. 
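
The conditional editing idea in item 2 can be pictured as a small text
processor that selects lines of a generalized program text according to
user parameter settings. The following Python sketch assumes a
hypothetical "%IF name=value" / "%END" directive syntax (without
nesting) and a made-up WORDSIZE parameter; neither comes from the
systems cited above.

```python
from typing import Dict, List

# Generalized program text with hypothetical conditional-editing directives.
TEMPLATE: List[str] = [
    "      PROGRAM DEMO",
    "%IF WORDSIZE=60",
    "      INTEGER BIG",
    "%END",
    "%IF WORDSIZE=32",
    "      DOUBLE PRECISION BIG",
    "%END",
    "      END",
]


def conditional_edit(template: List[str],
                     settings: Dict[str, str]) -> List[str]:
    """Produce the program text for one environment.

    Lines between %IF name=value and %END are kept only when the user
    setting for `name` equals `value`. Nesting is not supported.
    """
    out: List[str] = []
    keep = True
    for line in template:
        stripped = line.strip()
        if stripped.startswith("%IF "):
            name, _, wanted = stripped[4:].partition("=")
            keep = settings.get(name.strip()) == wanted.strip()
        elif stripped == "%END":
            keep = True
        elif keep:
            out.append(line)
    return out


if __name__ == "__main__":
    for text_line in conditional_edit(TEMPLATE, {"WORDSIZE": "32"}):
        print(text_line)
```

One generalized text then yields a specific program text per target
environment simply by changing the settings.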

4.  REQUIREMENTS/SPECIFICATION  DECOMPOSITION 

The paramount problem facing the V&V community is the development
of a rigorous method for evolving a consistent set of system
requirements (or system specifications), particularly as they apply
to the embedded computer element. Requirements/specifications
differ from a software design in that they address the actual real-world
problem; they are similar in that iterative decomposition techniques,
combined with automated consistency analysis, can be used to develop
them.


There  have  been  few  actual  experiments  involving  the  use  of 
hierarchical  decomposition  techniques  in  developing  specifications 
(or  requirements)  for  computer  software  systems.  What  has  been  done 
is  promising.  For  example,  the  tools  provided  by  ISDOS  (Ref.  19) 
have  been  used  in  this  fashion. 

What seems to be lacking in this area is (i) a sufficient level
of machine-based transaction processing (each new intellectual input
represents a single transaction), and (ii) the basis for developing
internal consistency checking.

4.1  Consistency Checking

The  cross-links  that  are  needed  for  automatic  consistency 
checking  can  be  inserted  manually,  provided  that  a syntax  appropriate 
to  the  task  is  provided.  If  this  is  accomplished,  the  database 
processing needed becomes relatively trivial. One project (Ref. 20)
appears on the verge of accomplishing this.

The outcomes of an automated consistency check procedure are
less powerful than one might imagine, but hardly powerless. Although
it is impossible to prove the internal consistency of a set of
requirements/specifications (Ref. 19), the capability to identify many
classes of inconsistencies serves to strengthen the interpretable
validity of the document.
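
To make the role of manually inserted cross-links concrete, the Python
sketch below checks a tiny requirements set for two classes of
inconsistency: an item defined by more than one requirement, and an item
used but never defined. The record layout, the requirement identifiers
and the item names are hypothetical. Consistent with the point above, an
empty report shows only that these particular classes were not found; it
does not prove the set consistent.

```python
from typing import Dict, List, Set

# Hypothetical requirements, each with explicit cross-links.
REQUIREMENTS: Dict[str, Dict[str, Set[str]]] = {
    "R1": {"defines": {"track_file"}, "uses": {"radar_report"}},
    "R2": {"defines": {"radar_report"}, "uses": set()},
    "R3": {"defines": {"track_file"}, "uses": {"operator_id"}},
}


def find_inconsistencies(
        reqs: Dict[str, Dict[str, Set[str]]]) -> List[str]:
    """Report duplicate definitions and uses of undefined items."""
    defined_by: Dict[str, str] = {}
    problems: List[str] = []
    # First pass: record which requirement defines each item.
    for rid in sorted(reqs):
        for name in sorted(reqs[rid]["defines"]):
            if name in defined_by:
                problems.append(
                    f"{name} is defined by both {defined_by[name]} and {rid}")
            else:
                defined_by[name] = rid
    # Second pass: every used item must be defined somewhere.
    for rid in sorted(reqs):
        for name in sorted(reqs[rid]["uses"]):
            if name not in defined_by:
                problems.append(f"{rid} uses undefined item {name}")
    return problems


if __name__ == "__main__":
    for problem in find_inconsistencies(REQUIREMENTS):
        print(problem)
```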

4.2  Natural  Language  Analysis 

The other method is to employ full- or restricted-English in
concert with a natural language processing system (Ref. 21). In this
case, one would transact primarily in English (or near-English) and
the associated language processor would develop the links needed for
the consistency checking. Current advances in the techniques of
parsing English, and developing alternative semantic tree
interpretations of the meaning of English sentences, have the promise of
being highly effective in this particular role.

Current research appears to be producing a number of highly
automated systems which implement good (non-trivial) algorithms, and
which provide significant relief from the logistics of developing,
analyzing, and archiving an embryonic requirement/specification
document. One would hope that techniques of program decomposition
(Ref. 5) can be extended to stated requirements, so that a body of
methodologies for analyzing requirements (much in the same way as
programs are now analyzed) can be used to resolve ambiguities
effectively.

Advances in Technology for Verification and Validation


Edward  F.  Miller,  Jr. 


OBJECTIVES  OF  THIS  TALK 


• ASSESSMENT  OF  CURRENT  PRACTICE 

• EVALUATION  OF  CURRENT  DEVELOPMENT  TRENDS 

• INDICATION  OF  FUTURE  GROWTH  DIRECTIONS 


DEFINITIONS 


VERIFICATION:  THE PROCESS OF ASSURING THAT WHAT IS
INTENDED TO BE PRESENT IN SOFTWARE IS
ACTUALLY THERE.

VALIDATION:  THE PROCESS OF ASSURING THAT WHAT IS ACTUALLY
PRESENT IN THE SOFTWARE SOLVES THE RIGHT
PROBLEM.


THREE PRIMARY AREAS WITHIN THE LARGER FRAMEWORK OF
VERIFICATION AND VALIDATION:

• SYSTEMATIC  PROGRAM  TESTING 

• AUTOMATION  IN  SOFTWARE  DESIGN/PRODUCTION 

• AUTOMATION IN REQUIREMENTS/SPECIFICATION
DECOMPOSITION

RELIABLE TESTING THEORY


BASIC REFERENCES:  GOODENOUGH AND GERHART, "TOWARD A THEORY
OF TEST DATA SELECTION"

HOWDEN, "RELIABILITY OF THE PATH ANALYSIS
TESTING STRATEGY"


A PARAPHRASED  STATEMENT  OF  A BASIC  RESULT  ON  RELIABLE 
TESTING  THEORY: 


UNDER CERTAIN CONDITIONS IT IS POSSIBLE TO CONSTRUCT
TESTS OF PROGRAMS SUCH THAT NON-STRUCTURAL
ERRORS CAN BE DISCOVERED WHEN THE TESTS ARE
EXECUTED.


Program  Analysis 


Symbolic  Execution  Inequality  Solution  Test  Data  Derivation 


UNANSWERED  QUESTIONS  WITH  RELIABLE  TESTING  THEORY 


• HOW  DOES  ONE  CONSTRUCT  RELIABLE  TESTS  FOR  PROGRAMS 
THAT  EFFECTIVELY  DETECT  STRUCTURAL  ERRORS? 

• HOW  DOES  ONE  GENERATE  TEST  DATA  THAT  CORRESPONDS 
TO  A RELIABLE  TEST? 

• HOW  DOES  ONE  DEAL  WITH  MULTI-MODULE  RELIABLE 
TESTS? 

• WHAT IS THE IMPACT OF THIS DEVELOPING THEORY ON
THE PRACTICAL APPLICATION OF FORMAL STRUCTURE-BASED
TESTING METHODOLOGIES?

Difficult To Extend To Other Dialects


AUTOMATED  SOFTWARE  DESIGN/PRODUCTION: 


SOFTWARE  PRODUCTION  METHODOLOGY 


BASIC  INGREDIENTS:  GENERALIZED  DATABASE 

CURRENT  DESIGN/SOFTWARE  TEXTS 
ARCHIVE 

CONFIGURATION  CONTROL 

MACROEXPANSION 

MULTI-LEVEL  MACROS 
SELF-IMPLEMENTATION MACROS

CONDITIONAL  EDITING  FEATURES 

USER  PARAMETER  SETTINGS 

QUERY  RESOLUTION  (INTERACTIVE  MODE) 

PRODUCTION  CONTROL/STATUS 


AUTOMATED  SOFTWARE  DESIGN/PRODUCTION:  SOFTWARE  DESIGN  METHODOLOGY 


BASIC  INGREDIENTS:  HIERARCHICAL  DECOMPOSITION 

PROGRAM  STRUCTURE 
DATA  STRUCTURE 

PSEUDOCODING 

ALGORITHM  SELECTION 
ALGORITHM  STRUCTURE 

STRUCTURED  PROGRAMMING 

BASIC  CONSTRUCTS 
EXTENDED  CONSTRUCTS 

AUTOMATION  FEATURES 
CROSS-REFERENCING 

MANAGEMENT  REPORTING 

AUTOMATED REQUIREMENTS/SPECIFICATION DEVELOPMENT


• EXISTING SYSTEMS (ISDOS, ETC.)

• EXTENDED AUTOMATION/INTERACTIVE PROCESSING

• NATURAL LANGUAGE PROCESSING

• EXTERIOR TEST BEHAVIOR SPECIFICATION

PROGRAMMING LANGUAGE DESIGN
by

Ann E. Farmer-Squires
Department of Defense
Fort George G. Meade, Maryland 20755

ABSTRACT


In developing software to perform highly critical control
functions or to process classified or sensitive information, the
ability to demonstrate that the software is performing its
specified function correctly and that the integrity of the data
is maintained is of vital concern. This paper shows how the
goals of program verification and data security have influenced
the programming language to be used for software development. It
is suggested that certain programming language concepts and
mechanisms (namely, abstraction, modularity and access control)
facilitate achieving these goals and impact the design of the
language itself. Abstract data types are suggested as an
appropriate vehicle for their realization in a programming
language. This thesis is supported by examples of several
recent languages incorporating the notion of abstract data types
that have been designed to support software engineering
techniques and to facilitate verification.


KEY WORDS


abstract  data  type,  abstraction,  access  control,  programming 
methodology,  programming  language,  protection,  reliability, 
security,  verification. 


INTRODUCTION


The goal of developing software that satisfies rigorous
security requirements and that can be formally verified to
always execute in accordance with precisely stated
specifications of its behavior is ambitious. If at the same time
the software is not to be grossly inefficient or highly costly,
it is quite ambitious. Although absolute guarantees are
unlikely, a degree of confidence significantly better than has
been traditionally provided that the software performs as
desired is possible. Obviously, a critical feature of the
software is the programming methodology and the programming
language used to develop it. This paper identifies several
principles and concepts playing a key role in programming
language design that hold promise of helping achieve correct and
secure software. The interaction of these principles and
concepts and their impact on the software and the language in
which it is written is presented in this paper.


SECURITY APPLICATIONS


It has become generally recognized that designing and
implementing software and then testing it for an arbitrary
subset of possible inputs to detect any errors is inadequate to
demonstrate that the program meets its specifications and is
error-free. When a subset of these specifications describes the
security requirements that the software is to meet, it is clear
that this traditional method of development is also inadequate
to demonstrate that the software is secure. Security used in
this context is a subset of the total system reliability: it
refers to that portion of the software's adherence to
specifications dealing with the flow of privileged information.
Naturally, security is heavily dependent on the environment in
which the software will operate and on the application for which
it will be used. For the purposes of this discussion, a secure
program will be defined as one which


(i)    performs what is intended

(ii)   performs nothing else

(iii)  protects itself and the data it is handling from
       unauthorized modification, disclosure or destruction.

The adherence of a program to (ii) and (iii) differentiates a
secure program from a reliable program and is of vital concern
in applications involving the processing and controlling of
classified or sensitive information.

We  now  briefly  describe  two  typical  security  applications. 
Consider  an  application  which  processes  sensitive  data,  e.g.,  a 
file  containing  military  classified  data  or  "Privileged 
Personnel  Information".  For  various  reasons,  it  is  desirable  to 
have only one file with various processes operating on it and
sharing the data contained in it. It is also desirable not to
permit uniform access to the file: control and enforcement of
selective access is often more appropriate.

A second application is software which is controlling a
security function, e.g., message segregation and routing. In
this case, a decision is made and subsequent action is
taken based on data in the message. Assuring that the integrity
of the data is maintained, that the security function is
operating correctly and that it cannot be bypassed are of
critical concern. These applications will help focus the
discussion and should provide a useful basis for considering
some of the ideas developed later in the paper.


FOCUS ON PROGRAMMING LANGUAGE


Major obstacles in achieving reliable and secure software
have been the difficulties involved in stating precise formal
specifications for what the software is to do, formally defining
what security of the software means and then formally
demonstrating that these specifications have been met. Often
security requirements have been nebulous and left to implicit
understanding, resulting in unanticipated or unacceptable
software behavior. For the typical applications briefly defined
earlier a formal definition of system security specifications
appears to be a doable task; however, significantly better
support from the programming methodology and the programming
language for software development is crucial.

Attention  is  being  focused  on  programming  languages  for  the 
following  reasons: 

* A programming language is the vehicle for man/machine
communication used to evoke computations. There is some
evidence that the design of the language itself determines how
the computation to be performed will be expressed. Using a
language well-suited to the problem permits better communication
of the functionality of the computation free of needless detail.
This is highly desirable for security applications for which
verification of correct and secure operation is deemed
important.

* It is becoming clear that the language chosen for
developing software is vital if we intend to say something about
that software with a high degree of confidence. Certain
language features are error-prone constructs; other language
features are known to make program verification difficult. It
has also been claimed that the ability to verify programs not
written with the intention of being verified or to verify
programs written in "classical" languages not designed with ease
of verification in mind (without suitable restrictions) are
unreasonable goals for program verification [Fisher 75].

* If the verification of program properties (such as
correctness and security) is to become a practical reality,
significantly better language support is necessary to keep the
task manageable. In particular, more adequate support for
incremental design, implementation and proof and for program
structuring and modularizing into small independent pieces such
that the proof may be carried out on a module-by-module basis
are some of the key issues.

Several software concepts have been identified that can
facilitate the formal stating and verifying of security-related
properties and thus help achieve the goal of reliable and secure
software. These key concepts are abstraction, modularity and
access control. The next section briefly defines them and
discusses their influence on the design of programming languages
to support verification and security.


ABSTRACTION, MODULARITY AND ACCESS CONTROL


The principle of abstraction is a key concept in controlling
complexity in software development. It refers to the process of
identifying essential properties that are common to entities and
neglecting their inessential details. This definition of
abstraction implicitly includes the notion of information hiding
advocated by Parnas [Parnas 72b], which highlights the importance
of making inessential information inaccessible. The idea of
using abstractions in developing software is not really new.
High level programming languages provide some degree of
abstraction in the sense that one need not program at the
assembly or machine language level and be concerned with details
at that level. Many high level languages also provide another
effective abstraction mechanism, namely, routines or procedures,
for grouping a sequence of language statements logically to
perform a specific function.


The abstraction mechanism found in conventional languages,
enabling a programmer to write procedures or functions, refer to
them by name and use them in other portions of his program, does
not provide a sufficiently rich set of abstractions. Functions
and procedures correspond to what has been termed "abstract
operations"; however, they operate on "concrete data".
Procedures perform operations on data that are tied to the
representation of that data and are often meaningful only with
respect to that particular implementation. If the data
structure implementation is modified, all the procedures
operating upon it frequently will require modification to deal
with the new implementation. If a language abstraction
facility were available which expresses the relationship between
these abstract operations and the abstract data structure (e.g.,
stack, list, binary tree) on which they operate, the need for
changing procedures when data implementations (concrete data)
are changed could be essentially eliminated. Unfortunately,
languages today do not adequately provide for the expression of
this relationship.

In addition to ease of program modification, other benefits
could ensue from better language support for abstraction. By
associating abstract operations with the corresponding data
structure, it should be easier to determine whether the
abstraction is complete, i.e., whether all the essential
features of the concept being abstracted are explicitly covered
and the concept adequately characterized. Detection of
inconsistencies or inaccuracies should be facilitated since the
unimportant detail is suppressed and the important
characteristics explicitly expressed.

The abstraction facilities provided by the language greatly
influence the ease of program verification. It should be easier
to verify properties of the abstraction independent of its
concrete realization since there is a well-defined set of
abstract operations manipulating the abstract data structure. In
particular, the verification of properties related to the
integrity of the data should be facilitated because "there are
less things to look at and consider" dealing with the data at
that level. It should be possible to verify that invariant
properties of a data structure (e.g., the mathematical
characterization of a binary tree or a property of stacks that
pushing an element onto the top of the stack followed by popping
the top element makes visible the element on top of the stack
just prior to the push) are maintained independent of what
particular implementation is chosen. Once the properties of the
abstraction are established and verified, the program
verification task consists of demonstrating that the concrete
representation correctly implements the abstraction. There is
growing evidence that this two level process makes practical
verification more manageable [Wulf 76, Ambler 76b].

Better language support for abstraction should also ease the
programming task in that inessential implementation detail can
be ignored during the initial phases of software development. It
also would encourage concentrating on the problem solution in
terms relevant to the problem domain. Of course,
implementation issues will have to be dealt with; however,
allowing those decisions to be deferred until the appropriate
time should ease the programming task and keep it manageable.

Abstraction provides little value for practical applications
without being used in conjunction with modularity. Essential to
the use of abstraction is the assurance that appropriate
abstractions are being utilized. The concept of modularity in
software development can help guide this process. Research on
modularization techniques has recently focused on methods of
dividing large programs into small independent modules in such a
way that they interact only in predictable and precisely
specified ways [Wirth 76, Parnas 72a]. The issue of concern in
modularization is not in keeping modules small by any measure as
much as it is a question of module localization,
non-interference and well-defined interaction. Many of these
ideas are captured in the modular decomposition process
advocated by Parnas [Parnas 72b] in which a program is developed
as a hierarchical structure of units (modules), each unit being
the realization of some abstraction and further elaborating
concepts introduced at a higher level. The idea of "hiding" is
further advocated by Parnas as the major criterion for the
decomposition of the program into appropriate modules. Hiding
makes visible (external to the module) only those properties
necessary to interface with other modules. In particular, the
caller of a module is restricted to precisely this external
interface information and not to the details of the module
implementation. This explicit statement of properties necessary
to users of a module (including when a module call will evoke an
error condition) should facilitate the verification that the
module interaction is as specified. This is a key step in
verification of security properties of a program.

The concept of access control, a term drawn from research in
operating systems, is the third key idea in achieving secure
programs. Access control mechanisms are used in operating
systems to control sharing of data and to protect against
unauthorized access to data. They explicitly define and enforce
constraints on the sharing of data. Current programming
languages do not provide adequate support for selective and
controlled sharing of data. Consider the typical security
applications whose characteristics were briefly defined earlier
in the paper. Uniform access to a file by all the program units
which are to manipulate the file is often highly undesirable.
Assuring that the integrity of some data is maintained and that
programs are well-behaved (exhibit behavior according to their
specification and none other) is also of concern. For such
applications, incorporation of an access control mechanism into
a programming language to explicitly allow controlled and
selective access to data should help facilitate the verification
of properties needed to demonstrate a program's secure behavior.
A recent paper [Jones 76] illustrates how an access control
facility can be incorporated into a class of programming
languages that the authors call "object-oriented" languages.
Such languages allow the definition of abstract data types: data
is viewed as an abstract object and an associated set of abstract
operations for manipulating the object. Benefits gained from
such a facility are described in that paper.

The notion of abstract data types is being suggested to tie
all of these principles together and realize them in a
programming language. We feel that abstract data types are an
appropriate vehicle for expressing abstractions, an appropriate
unit of modularity and an appropriate vehicle for embedding an
access control mechanism in a programming language. The next
section defines what is meant by an abstract data type.

ABSTRACT DATA TYPES AND THEIR IMPLEMENTATION

An abstract data type can be thought of as an extension and
modification of the concept of data type. It is "abstract" and
differs from the common notion of a data type in that an
abstract data type is used at one level but realized at a lower
level. The definition of an abstract data type includes:

(i)  definition of the operations applicable to objects of
that type

(ii)  a declaration of a representation for abstract data
objects

(iii)  realization of the operations on the representation.

As a very simple example, consider the abstract data type
"stack". A possible set of operations that are meaningful to
objects of type stack consists of

push     pushes an element onto the top of the stack
         unless full

pop      pops the top element off the stack and does
         nothing if the stack is empty

top      returns the value of the top element of the
         stack

empty    returns the value TRUE if and only if the
         stack is empty

(These definitions would be stated more formally for use in
verifying the correctness of (iii) and as the visible
specification for stack. They are stated informally here in
order to illustrate the concept.)

This set of operations available on type stack completely
defines the characteristics of its behavior and captures the
abstraction of stack. These operations are meaningful for the
type stack, not for the concrete representation of stack. The
second portion of the stack definition is a declaration of a
representation for objects of type stack. A typical choice is
the representation as elements of an array. Note that the user
of an abstract type stack need not know that the concrete
representation of a stack is an array. The user of the abstract
type should be unaffected if the representation is changed to a
list, as long as it conforms to the stated specifications of the
stack operations. (We are assuming a proper implementation, of
course.)
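
The three parts of the stack definition can be sketched in a present-day
language. The following Python fragment is only an illustration, not
notation from any of the languages surveyed here: the class names
(Stack, ArrayStack, ListStack), the capacity parameter and the helper
reversed_items are hypothetical, and raising IndexError when top is
applied to an empty stack is an assumption, since the informal
specification above leaves that case open. Python's leading-underscore
names hide the representation by convention only, not by enforcement.

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """Part (i): the operations a user of type stack may rely on."""

    @abstractmethod
    def push(self, x): ...

    @abstractmethod
    def pop(self): ...

    @abstractmethod
    def top(self): ...

    @abstractmethod
    def empty(self): ...


class ArrayStack(Stack):
    """Parts (ii) and (iii) with a fixed-size array as representation."""

    def __init__(self, capacity):
        self._cells = [None] * capacity
        self._count = 0

    def push(self, x):
        if self._count < len(self._cells):  # does nothing when full
            self._cells[self._count] = x
            self._count += 1

    def pop(self):
        if self._count > 0:  # does nothing when empty
            self._count -= 1

    def top(self):
        if self._count == 0:
            raise IndexError("top of empty stack")  # assumed behavior
        return self._cells[self._count - 1]

    def empty(self):
        return self._count == 0


class ListStack(Stack):
    """The same abstraction with a linked list of (value, rest) pairs."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._count = 0
        self._head = None

    def push(self, x):
        if self._count < self._capacity:  # does nothing when full
            self._head = (x, self._head)
            self._count += 1

    def pop(self):
        if self._head is not None:  # does nothing when empty
            self._head = self._head[1]
            self._count -= 1

    def top(self):
        if self._head is None:
            raise IndexError("top of empty stack")  # assumed behavior
        return self._head[0]

    def empty(self):
        return self._head is None


def reversed_items(items, stack):
    """User code written only against the abstract operations."""
    for item in items:
        stack.push(item)
    out = []
    while not stack.empty():
        out.append(stack.top())
        stack.pop()
    return out
```

Because reversed_items touches the stack only through push, pop, top and
empty, it gives the same result for either representation, which is the
sense in which the user is unaffected by a change from an array to a
list.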

The concept of abstract data types was first implemented in
the class construct in Simula 67 [Dahl 66]. Simula classes were
designed to represent and permit unrestricted accessibility to
data objects. Class attributes and functions are accessible in
the block in which the class definition is embedded; thus the
actual form of the data representation can be accessed. Although
protection was not a feature of Simula 67, several protection
mechanisms have been suggested as language extensions [Hoare 7?,
Paine 72, Spitzen 71].

The current notion of abstract data type includes the
protection of the data representation from direct access by any
operations (e.g. procedures or functions) not defined as part of
the type definition. This feature is valuable in protecting
the representation from unauthorized or undesirable
modifications by other parts of the program. It is also
important in aiding the verification that data objects maintain
their specified properties even when the data representation is
modified (authorized modification). The use of abstract data
types to facilitate program modifications is treated more fully
by Linden [Linden 76].

More recently, various programming language constructs have
been suggested to support modularization, to deal with data type
abstraction in a protected and controlled manner, and to support
the generation of well-structured programs. Liskov and Zilles
[Liskov 74] have developed a language CLU that implements
function clusters. A cluster can guarantee integrity of its data
objects because it controls their creation and access to an
object is only through operations defined by the cluster. Wulf
has introduced the concept of form, a mechanism for abstraction,
into Alphard [Wulf 74, 76]. Forms are a somewhat more general
structure than clusters; their major use is in describing an
abstract data structure, the operations upon it and the
permitted accesses to its components. The language Gypsy,
developed at the University of Texas, is a language for
specifying and implementing programs which are rigorously
verifiable [Ambler 76a, 76b]. Gypsy uses a type unit that
includes an access list associated with the object to realize
the concept of abstract data types and explicitly state the
access to their internal representation. The language Euclid
was designed to facilitate the development of verifiable systems
programs [Lampson 76]. Hierarchical nesting of program units is
a feature of Euclid; type definitions are used with import and
export lists (and suitable restrictions) to permit the
definition of abstract data objects and to restrict access to
their internal representations. At the time of this writing,
the Euclid report issued was in draft form and not to be
circulated. Since some of the issues related to abstract data
types may require modification in the final language definition,
the report will not be quoted. It is included in the survey to
indicate another language design that has been influenced by the
goals of verification and security.

CLU was recently designed at MIT as a language "to support
structured programming". Procedures, which support functional
abstractions, and operation clusters, which support abstract
data types, are provided in the language. The compilation of
the cluster defines the meaning of the type and the operations
on that type (functions in the cluster) completely characterize
the data types. The representation of the cluster is not
accessible outside the cluster. A cluster definition looks like
this:

<cluster-name>: cluster [(<cluster-parameters>)] is
                <operator-list>

where

<cluster-name>         is the name of the abstract type
                       being defined by the cluster

<cluster-parameters>   is the information that needs to
                       be made available when objects of
                       the abstract type are created

is                     indicates that the data type defined
                       corresponds to a set of operations

<operator-list>        is the set of operations that may
                       be performed on objects of that type.


The other information provided by the cluster definition is:

(i)  a description of the representation of objects of
the abstract type

(ii)  code which is to be executed when the objects are
created

(iii)  set of operator definitions.

The cluster functions are the means for accessing the data type
representation [Liskov 74].


Alphard, recently developed at Carnegie-Mellon University,
was designed to support the development of well-structured
programs and to facilitate their verification. Forms are the
abstraction mechanism provided in the language. A form
definition looks like:

form <form-name> (<form-parameters>) =
   beginform
      specifications ... ;
      representation ... ;
      implementation ... ;
   endform;

where

<form-name>            is the name of the abstract type
                       being defined by the form

<form-parameters>      is the known or desired information
                       to be made available when objects of
                       the abstract type are created

specifications ...     is the information about the form
                       necessary for its users -- it defines
                       a set of accesses to objects of the
                       type being defined by the form

representation ...     defines the concrete representation
                       and related properties of an object
                       of the abstract type

implementation ...     is the definitions of the functions
                       that may be applied to an object of
                       the abstract type

A key feature of Alphard facilitating the verification of
properties of a form is the separation of the use of the
abstract type defined by the form from the definition of its
representation. This feature permits the verification of the
form independent of the abstract program which uses the form.
This keeps the verification task manageable and minimizes
verification resulting from program changes [Wulf 76]. Similar
benefits can be cited for the other languages described.

Languages such as Simula 67, CLU and Alphard have been
suggested as suitable environments for embedding an access
control facility [Jones 76]. In order to incorporate access
control in the language (which can be checked at compile-time) a
type definition is extended to specify a set of rights. The
user of a type's operation must have the appropriate rights to
the objects passed as parameters to the operation. The
definition of the proposed access control facility includes a
binding rule determining how variables are bound to objects.
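
The idea of attaching a set of rights to each use of an object can be
illustrated with a small Python sketch. It is not the facility proposed
in [Jones 76]: the names Guarded, restrict, call and Ledger are
hypothetical, and the checks below happen at run-time, whereas the
proposed facility is intended to be checkable at compile-time.

```python
class Guarded:
    """A reference to an object plus the rights (operation names) it grants."""

    def __init__(self, target, rights):
        self._target = target
        self._rights = frozenset(rights)

    def restrict(self, rights):
        """Return a reference with fewer rights; rights are never added."""
        return Guarded(self._target, self._rights & frozenset(rights))

    def call(self, operation, *args):
        """Apply an operation only if this reference holds the right to it."""
        if operation not in self._rights:
            raise PermissionError(f"no right to {operation}")
        return getattr(self._target, operation)(*args)


class Ledger:
    """A shared data object with two operations (illustrative only)."""

    def __init__(self):
        self._entries = []

    def append(self, entry):
        self._entries.append(entry)

    def read(self):
        return list(self._entries)


owner = Guarded(Ledger(), {"append", "read"})
reader = owner.restrict({"read"})   # selective, controlled sharing
owner.call("append", 10)
```

Here reader can observe the ledger but cannot modify it, and a further
call to restrict cannot restore the append right, since the new set of
rights is always intersected with the old one.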

Gypsy, recently developed at the University of Texas, was
designed as the core of an entire methodology for the
development of programs which can be designed, specified,
implemented and verified. Key features are the integration of
programming and formal specification facilities into the
language and complementary approaches to verification: formal
proofs that the program meets its specifications, validation of
specifications by run-time evaluation and a post-execution
program trace. Pascal, suitably restricted and modified, served
as the basis for the development of Gypsy. In particular, the
hierarchical nesting of Pascal and non-local variables were
eliminated; access lists were added. Gypsy deals with data
abstraction, modularity and access control in the following
manner. A program consists of units such as "routine" and
"type". Access to the internal representation of a type, for
example, is explicitly stated in an access list associated with
the object. The absence of non-local variables simplifies the
verification task and aids the incremental development of
programs. Similar to the other languages discussed, the
separation of the use of an abstraction from its implementation
facilitates the verification process.


CONCLUSIONS


The intent of this survey is to briefly show how the goal of
achieving correct and secure software has impacted the design of
programming languages for developing the software. Languages
designed to explicitly support the construction of
well-structured programs and to ease the task of verifying that
they meet their formal specifications have only recently come
into existence. Serious exploitation of these languages in
representative problem domains is essential before any
significant conclusions regarding their utility can be drawn.
Many open questions still remain. The security applications
described in the paper are suggested as one possible domain of
investigation.

Obviously, the principles of abstraction, modularity and
access control can be applied to almost any programming
language; however, the idea is to provide a language which
explicitly supports them and therefore encourages their use.
(A determined programmer can circumvent and misuse these
facilities, of course.) After all, programming is a problem
domain as valid as any other and it needs appropriate support
from a programming language.

Wulf 74     W. A. Wulf
            ALPHARD: Toward a Language to Support
            Structured Programs
            Carnegie-Mellon University, Department of
            Computer Science, April 1974

Wulf 76     W. A. Wulf, R. L. London and M. Shaw
            Abstraction and Verification in ALPHARD:
            Introduction to Language and Methodology
            Carnegie-Mellon University, Department of
            Computer Science, June 1976


SYMBOLIC EXECUTION:
A COMPROMISE BETWEEN
TESTING AND VERIFICATION


John A. Darringer

IBM T.J. Watson Research Center
Yorktown Heights, New York


PROBLEM:

HOW CAN WE BE CONFIDENT
A PROGRAM WILL DO
WHAT WE WANT?


IDEALLY: "PROVE IT CORRECT"

REQUIRES:

• SPECIFICATION OF DESIGNER'S INTENT
• LOOP INVARIANTS
• MODEL OF LANGUAGE SEMANTICS
• PROBLEM DOMAIN KNOWLEDGE

PROVIDES:

• MAXIMUM CONFIDENCE THAT SPECIFICATION
  WAS CORRECTLY IMPLEMENTED

BUT IS THE SPECIFICATION CORRECT?


PRACTICE - TEST IT

"... PRESENCE NOT ABSENCE ..." BUT
REQUIRES LITTLE:

• COMPILER EMBODIES PL SEMANTICS (EXACT)
• SPECIFICATION
• INVARIANTS
• ADDITIONAL KNOWLEDGE

ALL IN THE MIND
OF THE TESTER

PROVIDES:

• SOME CONFIDENCE THAT SPECIFICATION
  IS CORRECTLY IMPLEMENTED
• CONFIDENCE THAT SPECIFICATION
  IS SOUND


ALTERNATIVE: TEST IT SYMBOLICALLY

• CALL PROGRAM WITH SYMBOLIC VALUES

      CALL SEARCH("A", "N", "N"+4, "X");

• VARIABLES HAVE FORMULAE AS VALUES

      LB is "N", UB is "N"+4
      MID = (LB+UB)/2;      MID is "N"+2
      LB = MID+1;           LB is "N"+3

• CONDITIONAL STATEMENTS MAY PRODUCE
  BRANCHES IN SYMBOLIC EXECUTION TREE

      IF LB = UB THEN ... ELSE ...

• ONE SYMBOLIC TEST TYPICALLY CORRESPONDS
  TO ANY NUMBER OF CONVENTIONAL TESTS

EXAMPLE: BINARY SEARCH

[Figure: array A with positions LB, MID and UB marked, and search value X]

SEARCH: PROCEDURE (A, LB, UB, X);
   DCL A(*) INTEGER,
       (LB, UB, X, MID) INTEGER,
       FOUND BIT;
   FOUND = FALSE;
   DO WHILE (LB <= UB & ¬FOUND);
      /* FIND MIDPOINT */  MID = (LB + UB)/2;
      /* LOOK LEFT */      IF X < A(MID) THEN UB = MID-1; ELSE
      /* LOOK RIGHT */     IF X > A(MID) THEN LB = MID+1; ELSE
      /* FOUND IT */       FOUND = TRUE;
   END;
   DISPLAY (FOUND, MID);
END;


TESTING SEARCH

• SELECT PARTICULAR VALUES
    • ARRAY LENGTH LB, UB
    • VALUES A(LB) ... A(UB)
    • SEARCH VALUE X
• CALL SEARCH
• CHECK RESULTS
• REPEAT WHILE $ > 0

• EACH TEST CHECKS ONE POINT IN INPUT SPACE
  THAT IS ARBITRARILY LARGE IN SEVERAL
  DIMENSIONS

[Figure: INPUT SPACE]


SYMBOLICALLY TESTING SEARCH

FIXED LENGTH ARRAY

STATEMENT                            LB   UB    FOUND   MID   PC

CALL SEARCH("A","N","N"+4,"X");      N    N+4   .       .     TRUE
FOUND = FALSE;                       N    N+4   FALSE   .     TRUE
DO WHILE (LB<=UB & ¬FOUND);                                   (TRUE)
MID = (LB+UB)/2;                     N    N+4   FALSE   N+2   TRUE
IF X < A(MID) THEN ...                                        (ASSUME TRUE)
                                     N    N+4   FALSE   N+2   X<A(N+2)
UB = MID-1;                          N    N+1   FALSE   N+2   X<A(N+2)
DO WHILE (LB<=UB & ¬FOUND);                                   (STILL TRUE)
MID = (LB+UB)/2;                     N    N+1   FALSE   N     X<A(N+2)
IF X < A(MID) THEN ...                                        (ASSUME FALSE)
IF X > A(MID) THEN ...                                        (ASSUME FALSE)
                                     N    N+1   FALSE   N     X<A(N+2) & X=A(N)
FOUND = TRUE;                        N    N+1   TRUE    N     X<A(N+2) & X=A(N)
DO WHILE (LB<=UB & ¬FOUND);                                   (FALSE)
DISPLAY (FOUND, MID);                N    N+1   TRUE    N     X<A(N+2) & X=A(N)

SYMBOLIC EXECUTION TREE (LENGTH = 5)

[Figure: execution tree whose internal nodes compare X with elements
A(N) ... A(N+4) and whose leaves are marked FOUND or not found]

11 SYMBOLIC TESTS EXHAUST INPUT SPACE (LENGTH = 5)
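
The count of 11 can be reproduced by walking the branches of SEARCH
with the bounds kept as offsets from the symbol N. The Python sketch
below is only an illustration and not EFFIGY: the names index, elem and
explore are hypothetical, each path condition records only the branch
outcomes taken, and no constraint solver is involved (for a sorted
array of distinct elements every path listed is assumed feasible). It
also assumes N is non-negative, so that (LB+UB)/2 for bounds N+lb and
N+ub equals N plus the integer half of lb+ub.

```python
def index(offset):
    """Write the symbolic index N+offset."""
    return "N" if offset == 0 else f"N+{offset}"


def elem(offset):
    """Write the array element at symbolic index N+offset."""
    return f"A({index(offset)})"


def explore(lb, ub, path=()):
    """List the leaves of the execution tree of SEARCH for N+lb .. N+ub.

    Each leaf is a pair (outcome, path_condition), where path_condition
    is the tuple of branch outcomes assumed along the way.
    """
    if lb > ub:  # DO WHILE test fails with FOUND still false
        return [("NOT FOUND", path)]
    mid = (lb + ub) // 2  # MID is N+mid
    a_mid = elem(mid)
    return (
        explore(lb, mid - 1, path + (f"X < {a_mid}",))      # look left
        + explore(mid + 1, ub, path + (f"X > {a_mid}",))    # look right
        + [(f"FOUND, MID = {index(mid)}", path + (f"X = {a_mid}",))]
    )


leaves = explore(0, 4)  # LB is N, UB is N+4: array of length 5
for outcome, condition in leaves:
    print(outcome, "when", " & ".join(condition))
```

For length 5 the walk yields five leaves on which X is found and six on
which it is not, one symbolic test per leaf.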


CAN DO THIS FOR FIXED LENGTH ARRAY


WHAT ABOUT ARBITRARY LENGTH ARRAY?
EXAMINE TOP OF INFINITE EXECUTION TREE

      CALL SEARCH("A", "LB", "UB", "X");

INFORMAL INDUCTION

PERHAPS TESTER WILL DETECT PATTERNS
IN THE EXECUTION TREE AND MAKE
INDUCTIVE ARGUMENTS.

[Figure: array segment of length UB-LB+1 running from LB to UB]

SUCH AS:

• SEARCH FOR LENGTH 5
• IF IT WORKS FOR LENGTH [illegible]
  THEN IT WILL WORK FOR LENGTH UB-LB+1
• THEREFORE ...


BINARY SEARCH:

• INPUT ASSERTION

      ∀i ∈ [LB : UB-1]  A(i) ≤ A(i+1)          A SORTED

• OUTPUT ASSERTION

      A = A'                                   A UNCHANGED
      X = X'                                   X UNCHANGED
      FOUND  ⇒  X = A(MID)
      ¬FOUND ⇒  ∀i ∈ [LB' : UB']  X ≠ A(i)

• LOOP INVARIANT

      X = X'
      ∀i ∈ [LB' : LB-1]   X ≠ A(i)             X NOT LEFT OF LB
      ∀i ∈ [UB+1 : UB']   X ≠ A(i)             X NOT RIGHT OF UB
      ∀i ∈ [LB' : UB'-1]  A(i) ≤ A(i+1)

• LEMMAS

      ∀i ∈ [LB : UB-1]  A(i) ≤ A(i+1)
      & X < A(j) & j ∈ [LB : UB]
      ⇒ ∀i ∈ [j : UB]  X ≠ A(i)

[Figure: BENEFIT plotted against COST for testing approaches]


SYMBOLIC TESTING

• PROVIDES MORE CONFIDENCE THAN TESTING,
  BUT REQUIRES A SYMBOLIC INTERPRETER.

• REQUIRES NO FORMAL SPECIFICATION OR
  INVARIANTS, BUT PROOFS ARE INFORMAL


A SYMBOLIC INTERPRETER CAN PROVIDE
A SPECTRUM OF SERVICES.

EFFIGY - A PROTOTYPE SYSTEM

PROGRAMMING LANGUAGE (PL/I-LIKE)

• INTEGERS, BITS (SCALAR, VECTOR)
• + - * / ** MOD ABS & | ¬
• ASSIGNMENT
• IF ... THEN ... ELSE
• DO ...
• ITERATIVE DO, DO WHILE ...
• GO TO AND LABELS
• PROCEDURES, RECURSIVE,
  PL/I PARAMETER CONVENTIONS
• TRACE
• BREAK POINTS
• STATE


EFFIGY:

SYMBOLIC EXTENSIONS

• CONSTANTS "A", "ABC", "X"
• PATH SELECTION
      GO TRUE, GO FALSE
      ASSUME (P), GO
• DISPLAY

SERVICES

• CALL P(1,3);          CONVENTIONAL EXECUTION
• CALL P("A","B");      SYMBOLIC EXECUTION
• CALL P("A",3);        MIXED
• VERIFY P;             VERIFICATION
• TEST (100) P;         DEPTH LIMITED GENERATION
                        OF EXECUTION TREE
• COVER                 FIND COVERAGE OF
                        TEST SET


TEST CASE COVERAGE

• GIVEN A SET OF TEST CASES {Ti},
  USE COVER TO FIND PREDICATES {Pi}
• IF Pi = Pj THEN BOTH TESTS FOLLOW SAME
  CONTROL FLOW PATH, SYMBOLIC TEST
• P = ∨ Pi INDICATES COVERAGE OF TESTS

TEST CASE GENERATION

• ¬P INDICATES PART OF INPUT SPACE NOT
  COVERED
• FIND SOLUTION TO ¬P FOR TEST


SUMMARY

• SYMBOLIC TESTING: ALTERNATIVE
  TO TESTING & VERIFICATION

• A SYMBOLIC INTERPRETER CAN PROVIDE
  A SPECTRUM OF SERVICES INCLUDING
  TESTING AND VERIFICATION


CURRENT CONCERNS:

LARGER CLASS OF DATA TYPES

• CHARACTERS
• POINTERS, BASED STRUCTURES

[one line illegible]

SYMBOLIC EXECUTION OF SPECIFICATIONS

• MORE ECONOMIC VERIFICATION & EXECUTION
• EARLIER VERIFICATION & EXECUTION

PROCEDURAL

EFFIGY CAN USE I/O SPEC IN VERIFICATION &
EXECUTION

ABSTRACTION - WE ARE
LOOKING AT

• TECHNIQUES FOR SPECIFYING DATA ABSTRACTION
• EXECUTION OF THESE SPECIFICATIONS
• PROVING LOWER LEVEL CODE AND SPECS
  CORRECTLY IMPLEMENT HIGHER LEVEL


SELF-METRIC AND SELF-CHECKING SOFTWARE AND VALIDATION



S.  S.  Yau  and  R.  C.  Cheung 

Department  of  Computer  Sciences 
Northwestern  University 
Evanston,  Illinois  60201 

Summary 

The knowledge of the dynamic behavior of a program is extremely useful
to its validation, especially for a large-scale program. Many of the
current limitations on software validation and performance evaluation
(performance testing may be considered a part of validation) are due to
the lack of understanding of the dynamic behavior of the software because
the behavior of a large-scale program is data-dependent and
environment-dependent. It is desirable for a program to be able to
automatically measure its own dynamic behavior. A program with such
capabilities is called a piece of self-metric software [1,2,3].
Self-metric software is useful for evaluation of effectiveness of a test,
the testedness of various parts of a program, detection of certain
anomalies, generation of output validation tests and performance
evaluation and optimization.

Self-metric software was first investigated and implemented in the
SNUPER Computer by Estrin, et al. [4] in the mid-60's. In this system,
they used hardware monitoring and software sampling techniques as well as
the self-metric software approach to measure equipment utilization,
instruction-type or resident-routine-type usage, and program execution
activity at the source statement level and machine language level.
Recently, self-metric software has been used extensively in automated
evaluation and validation systems, such as PACE [5], PET [1,2] and
JAVS [6,7].

There are many advantages to using self-metric software. The
implementation of software probes requires little support or modification
of the host computer system, and because software probes are normally
implemented in the source language, the implementation can be quite
convenient and flexible. There are primarily two types of self-metric
techniques: measuring the frequency of execution and measuring data
values. One of the most important measurements is the program path
frequency, which can be used for both program optimization and program
testing. Some recent results in optimal placement of counters for
measuring program path frequencies [8] will be discussed.

Self-metric software is used for collecting statistics of dynamic
program behavior and hence plays a passive role in program validation.
An extension of self-metric software is to incorporate the capability of
checking the program's dynamic behavior automatically during its
execution. A piece of software with such capability is called
self-checking software [9]. Although self-checking software has been
primarily used to verify the correct operation of a computer system
during execution time, especially for those real-time systems where
ultra-reliability is required, it plays an active role in program
validation during software development by isolating errors, test
generation and validation.

There are also many advantages to using self-checking software [9]. For
instance, it can be implemented at various program language levels
ranging from design language, high-level language and machine language
to microprogram. It checks hardware faults in addition to software
errors, and can serve as a security safeguard against malicious
intrusion. It can be applied to various levels of system design, such as
system level, module level, instruction level and data level. It can
also be applied to various stages of software development ranging from
design and coding to testing. There are three major types of
self-checking techniques [10]: functional checking, control sequence
checking and data checking.

Although self-metric and self-checking techniques have been used
extensively in certain software systems, there is a great need for
developing a formal methodology for designing and generating
cost-effective self-metric software and self-checking software. In order
to apply self-metric and self-checking software to program validation,
the most important research needed is to identify what information on
the program behavior is most useful for its validation and also easy to
obtain.
Testing.

To increase our confidence that the program will execute the specified
functions correctly with desired performance.


Objectives of Effective Testing

1. To detect as many errors (functional or performance deficiencies)
   as possible.

2. To detect the errors as early as possible in the software development
   cycle.

   (i)  Save software development cost
   (ii) Easier to correct an error

   Better programming productivity and program reliability.

3. To develop a comprehensive set of acceptance tests.


* Complex structure, and many instructions with complicated data
  structures.

* Developed by a large number of programmers, sometimes in different
  locations.

* A significant number of flow paths.

* Individual modules are usually thoroughly tested, but not the whole
  system. Program errors can usually be traced back to incorrect
  interactions between modules.

* Program behavior is data-dependent and environment-dependent.
  Program behavior cannot be reliably predicted.

* A large number of 'dormant' software bugs may still be present after
  the program has been put into operation for a long time.

* Frequent modifications:

  • Correct an error just discovered,
  • Satisfy new specifications and requirements,
  • Improve the efficiency of the program.
  • Each modification may introduce unexpected software errors.
[Figure: PROGRAMMING ENVIRONMENT and RUN-TIME ENVIRONMENT, with INPUT
DATA, SPECIFICATIONS and RESULTS]

Computations -> Programmer: information on run-time behavior
                            (self-metric)

Programmer -> Computations: monitoring of the computations performed
                            (self-checking)
SELF-METRIC SOFTWARE

Definition:      A PIECE OF SELF-METRIC SOFTWARE IS A PROGRAM WHICH
                 MEASURES ITS OWN DYNAMIC BEHAVIOR AUTOMATICALLY DURING
                 ITS EXECUTION.

Purposes:        TO COLLECT STATISTICS OF RUN-TIME BEHAVIOR OF PROGRAMS
                 SUCH AS
                   . FREQUENCY OF EXECUTION
                   . DATA VALUES

Application:     * PROGRAM OPTIMIZATION
                 * PROGRAM VALIDATION

Implementation:  INSERTION OF SOFTWARE PROBES INTO SOURCE PROGRAM
                 AFTER AUTOMATED STATIC ANALYSIS
► Software probes are implemented in source language
    • convenient
    • flexible

► High degree of automation possible

► Optimization possible to minimize interference to program operation

► Effect of interference predictable

► Easy insertion and removal of software probes

► Requires little support or modification of the host computer system

TYPES OF SELF-METRIC TECHNIQUES


Frequency of Execution

  * Program Profile        - frequency of execution of every statement

  * Data Activity Profile  - frequency of reference
                             frequency of assignment
                             frequency of a particular operation on a
                             piece of data

  * Program Path Frequency - frequency of activation of a control
                             sequence

Measurement of Data Values

  * Program Variables      - initial value
                             final value
                             minimum value
                             maximum value
                             mean
                             variance

  * Parameter Values
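
The data-value measurements listed above can be collected by a small software probe. The following is only a sketch under assumptions: the class name `ValueProbe`, its method names and the sample values are hypothetical, and the running mean and variance use Welford's method, which the source does not prescribe.

```python
import math


class ValueProbe:
    """Hypothetical probe that records statistics for one program variable."""

    def __init__(self):
        self.count = 0
        self.initial = None
        self.final = None
        self.minimum = math.inf
        self.maximum = -math.inf
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def record(self, value):
        """Record one assignment to the monitored variable and return it."""
        self.count += 1
        if self.count == 1:
            self.initial = value
        self.final = value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)
        return value

    @property
    def variance(self):
        """Population variance of all recorded values."""
        return self._m2 / self.count if self.count else 0.0


probe = ValueProbe()
for sample in [4, 8, 6, 2]:   # assumed sequence of values taken by a variable
    x = probe.record(sample)  # the probe wraps each assignment to x
```

Because `record` returns its argument, a probe of this kind can be wrapped around an existing assignment and removed again without changing the program's results.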
APPLICATIONS OF PROGRAM PATH FREQUENCIES


PROGRAM OPTIMIZATION

* NODE FREQUENCY MEASUREMENT PINPOINTS THE SEGMENTS OF CODE THAT ARE
  EXECUTED MOST OFTEN.

* PATH FREQUENCY MEASUREMENT PINPOINTS THE PROCESSES (CORRESPONDING TO
  PATHS IN A PROGRAM) THAT ARE EXECUTED MOST OFTEN.

  OPTIMIZATION EFFORTS SHOULD CONCENTRATE ON THESE PARTS.

* PATH FREQUENCY MEASUREMENT GIVES US THE BRANCHING CHARACTERISTICS
  FOR EFFICIENT LOOK-AHEAD SCHEDULING, ESPECIALLY FOR PIPELINED
  MACHINES LIKE THE CDC 6600. IT ALSO HELPS REGISTER ALLOCATION AND
  SUBROUTINE OPTIMIZATION.

* DETECTION OF DEAD CODE.

* OPTIMIZATION OF THE PLACEMENT OF FILES IN A SEQUENTIAL MEMORY STORAGE.

* PROGRAM RESTRUCTURING FOR VIRTUAL MEMORY SYSTEM WITH PAGING.

PROGRAM TESTING

* EVALUATE THE TESTEDNESS OF A PIECE OF SOFTWARE.

* DEAD CODE DETECTION.

* ELIMINATE REDUNDANT TESTING BY PINPOINTING PORTIONS OF THE PROGRAM
  WHERE ADDITIONAL TESTS ARE NEEDED.

* HELPS TO DESIGN EFFECTIVE TESTS BY PATH SENSITIZATION.
DIRECTED GRAPH MODEL OF PROGRAM STRUCTURE

A PROGRAM IS REPRESENTED BY A DIRECTED GRAPH SUCH THAT

  A NODE REPRESENTS A COMPUTATIONAL TASK (SEGMENT OF CODE)
  AN ARC REPRESENTS A POSSIBLE TRANSFER OF CONTROL FROM A
  SEGMENT OF CODE TO ANOTHER.


* A PATH IS A SEQUENCE OF ARCS.

* THE LENGTH OF A PATH IS THE NUMBER OF ARCS ON THE PATH.

    A NODE IS A PATH OF LENGTH ZERO.

    AN ARC IS A PATH OF LENGTH ONE.

* A SPANNING TREE OF A DIRECTED GRAPH G IS A TREE (UNDIRECTED)
  WHICH CONTAINS ALL THE NODES OF G AND IS A SUBGRAPH OF G.


CONSERVATION OF FLOW AT A NODE

DEFINITION:  F(P) = FREQUENCY OF EXECUTION OF THE PATH P
                    IN THE PROGRAM
                  = FLOW THROUGH THE PATH P
THEOREM 1


THE KNOWLEDGE OF THE FREQUENCIES OF EXECUTION OF ALL THE
DIRECTED ARCS NOT IN A SPANNING TREE OF THE PROGRAM GRAPH
IS BOTH NECESSARY AND SUFFICIENT TO DETERMINE THE
FREQUENCIES OF EXECUTION OF ALL THE DIRECTED ARCS IN THE
PROGRAM GRAPH.

REMARKS:

. IF THERE ARE |A| ARCS AND |N| NODES IN A PROGRAM
  GRAPH, A NON-REDUNDANT SET OF COUNTERS TO MEASURE
  THE FREQUENCIES OF EXECUTION OF ALL THE ARCS WILL
  CONSIST OF |A| - |N| + 1 COUNTERS PLACED ON THE ARCS
  ON THE COMPLEMENT OF A SPANNING TREE.

. THEOREM 1 HOLDS FOR ANY DIRECTED GRAPH SUCH THAT
  CONSERVATION OF FLOW IS OBSERVED AT EVERY NODE.
ALGORITHM 1: OPTIMAL MEASUREMENT OF ARC FREQUENCIES


1. ASSIGN A COST $C_i$ TO EVERY ARC IN THE GRAPH AS THE COST
   OF PLACING A COUNTER ON THE ARC $A_i$.

   $$C_i = K_i + d f_i$$

   WHERE $K_i$ = COST DUE TO STORAGE AND EXECUTION TIME
                 INTERFERENCE BY MONITORING ARC $A_i$
         $d$   = COST OF MEASUREMENT OVERHEAD FOR EVERY
                 ACTIVATION OF THE COUNTER
         $f_i$ = FREQUENCY OF ACTIVATION OF THE COUNTER
                 ON ARC $A_i$

2. FIND THE MAXIMUM (COST) SPANNING TREE OF THE PROGRAM
   GRAPH USING THE GREEDY ALGORITHM.

3. PLACE A COUNTER ON EACH OF THE ARCS NOT ON THE MAXIMUM
   SPANNING TREE.
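
The three steps of Algorithm 1 can be sketched briefly. This is an illustration under assumptions, not the authors' implementation: the four-node graph, the arc names, and the values of K, d and the frequencies are hypothetical, and Kruskal's method with a union-find structure is assumed as the greedy algorithm.

```python
K = 1.0  # assumed fixed cost of a counter (storage, interference)
D = 0.5  # assumed overhead cost per activation of a counter

# Hypothetical program graph: arc name -> (tail, head, expected frequency).
# The arc "ts" returns control from exit to entry so that flow is
# conserved at every node.
NODES = ["S", "A", "B", "T"]
ARCS = {
    "sa": ("S", "A", 10),
    "ab": ("A", "B", 6),
    "at": ("A", "T", 4),
    "bt": ("B", "T", 6),
    "ts": ("T", "S", 10),
}


def cost(freq):
    """Step 1: cost of placing a counter on an arc, C = K + d * f."""
    return K + D * freq


def place_counters(nodes, arcs):
    """Return the names of the arcs that need a counter."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Step 2: greedy maximum spanning tree, ignoring arc direction.
    tree = set()
    by_cost = sorted(arcs, key=lambda a: cost(arcs[a][2]), reverse=True)
    for name in by_cost:
        tail, head, _ = arcs[name]
        root_t, root_h = find(tail), find(head)
        if root_t != root_h:
            parent[root_t] = root_h
            tree.add(name)
    # Step 3: counters go on the arcs outside the tree.
    return set(arcs) - tree


counters = place_counters(NODES, ARCS)
```

With these assumed figures, two counters are needed (|A| - |N| + 1 = 5 - 4 + 1), and the expensive, frequently executed arcs stay in the tree where they need no counter.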


AN EXAMPLE OF OPTIMAL ARC FREQUENCY MEASUREMENT


COUNTERS SHOULD BE PLACED ON
THE ARCS $B_1$, $B_2$, $C_1$ AND $E_1$.

$$F(A_1) = F(B_1) + F(B_2)$$

$$F(A_2) = F(C_1)$$

$$F(D_1) = F(E_1) - F(B_1)$$

$$F(D_2) = F(B_1) + F(C_1) - F(E_1) + F(B_2)$$
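
Once the four counters of this example have been read, the frequencies of the remaining arcs follow by arithmetic from the four equations F(A1) = F(B1) + F(B2), F(A2) = F(C1), F(D1) = F(E1) - F(B1) and F(D2) = F(B1) + F(C1) - F(E1) + F(B2). A minimal sketch, in which the function name and the counter readings are hypothetical:

```python
def derive_arc_frequencies(counts):
    """Apply the example's four equations to the measured counters."""
    b1, b2, c1, e1 = (counts[k] for k in ("B1", "B2", "C1", "E1"))
    return {
        "A1": b1 + b2,
        "A2": c1,
        "D1": e1 - b1,
        "D2": b1 + c1 - e1 + b2,
    }


readings = {"B1": 3, "B2": 5, "C1": 4, "E1": 7}  # assumed counter values
derived = derive_arc_frequencies(readings)
```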
THEOREM 2 (PATH FLOW CONSERVATION THEOREM)

FOR EVERY PATH P OF LENGTH n-1, $(N_1, N_2, \ldots, N_n)$, FROM NODE
$N_1$ TO $N_n$ IN A DIRECTED GRAPH G, WE CAN WRITE TWO FLOW
CONSERVATION EQUATIONS FOR THE FLOW OF ALL THE PATHS OF
LENGTH n THAT CONTAIN P AS A SUBPATH:

$$\sum_{j=1}^{r} F(A_j^1 P) = F(P) = \sum_{k=1}^{s} F(P A_k^n)$$

WHERE $A_j^1$ (FOR j = 1, ..., r) ARE THE DIRECTED ARCS ENTERING $N_1$
AND $A_k^n$ (FOR k = 1, ..., s) ARE THE DIRECTED ARCS LEAVING $N_n$.
DEFINITION


THE ith ORDER GRAPH OF A DIRECTED GRAPH G, $G_i^*$, IS A DIRECTED
GRAPH THAT REPRESENTS THE RELATIONSHIP OF THE FLOW OF THE
PATHS OF LENGTH i IN G WITH FLOW CONSERVATION OBSERVED AT
EVERY NODE. THEREFORE AN ARC IN $G_i^*$ REPRESENTS A PATH OF
LENGTH i AND A NODE A PATH OF LENGTH i-1 IN G.

ALGORITHM 2: CONSTRUCTION OF $G_{i+1}^*$ FROM $G_i^*$

1. FOR EVERY ARC $A_j^{*i}$ IN $G_i^*$ CREATE A NODE $N_j^{*i+1}$ TO
   REPRESENT IT IN $G_{i+1}^*$.

2. AN ARC $(N_j^{*i+1}, N_k^{*i+1})$ IS CREATED IN $G_{i+1}^*$ IF
   $A_k^{*i}$ FOLLOWS $A_j^{*i}$ IN $G_i^*$.
ALGORITHM 3: OPTIMAL MEASUREMENT OF FREQUENCIES OF PATHS OF
LENGTH n OF THE PROGRAM GRAPH G

1. STARTING FROM THE PROGRAM GRAPH G ($= G_1^*$), CONSTRUCT THE
   nth ORDER GRAPH OF G, $G_n^*$, BY REPEATEDLY APPLYING
   ALGORITHM 2.

2. THE OPTIMAL MEASUREMENT OF FREQUENCIES OF PATHS OF LENGTH
   n OF THE PROGRAM GRAPH G IS EQUIVALENT TO THE OPTIMAL
   MEASUREMENT OF ARC FREQUENCIES OF THE GRAPH $G_n^*$.
   THEREFORE ALGORITHM 1 CAN BE USED TO LOCATE THE OPTIMAL
   SITES FOR COUNTER PLACEMENT.
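
The construction of the next-order graph (Algorithm 2) amounts to turning every arc into a node and joining two such nodes whenever one arc can immediately follow the other. The sketch below assumes that reading; the function name, the dotted naming of new arcs, and the five-arc example graph are hypothetical.

```python
def next_order_graph(arcs):
    """Build the next-order graph from arcs given as name -> (tail, head).

    Every arc of the input becomes a node of the result. An arc is created
    from node `first` to node `second` whenever arc `second` starts at the
    node where arc `first` ends; its name records the path it represents.
    """
    new_arcs = {}
    for first, (_, head) in arcs.items():
        for second, (tail, _) in arcs.items():
            if head == tail:
                new_arcs[first + "." + second] = (first, second)
    return new_arcs


# Hypothetical program graph G (first-order graph).
G1 = {
    "sa": ("S", "A"),
    "ab": ("A", "B"),
    "at": ("A", "T"),
    "bt": ("B", "T"),
    "ts": ("T", "S"),
}
G2 = next_order_graph(G1)  # arcs of G2 are the paths of length 2 in G
```

In this assumed example G2 has six arcs and five nodes, so by Theorem 1 two counters (6 - 5 + 1) suffice to determine the frequencies of all paths of length 2.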
AN EXAMPLE OF OPTIMAL PATH FREQUENCY MEASUREMENT (OF PATHS
OF LENGTH 2)


COUNTERS SHOULD BE PLACED ON THE PATHS ..., $B_2 D_2$ AND $D_1 E_1$.


ALGORITHM 4: CONSTRUCTION OF $G_{i-1}^*$ FROM $G_i^*$

1. FOR EVERY NODE $N_j^{*i}$ IN $G_i^*$ CREATE TWO NODES $N_{j1}^{*i-1}$
   AND $N_{j2}^{*i-1}$ IN $G_{i-1}^*$ WITH AN ARC
   $A_j^{*i-1} = (N_{j1}^{*i-1}, N_{j2}^{*i-1})$ BETWEEN THEM.

2. FOR EVERY ARC $(N_j^{*i}, N_k^{*i})$ IN $G_i^*$ MERGE THE NODES
   $N_{j2}^{*i-1}$ AND $N_{k1}^{*i-1}$ TOGETHER INTO A NODE IN
   $G_{i-1}^*$. DO NOT ELIMINATE ANY ARC $A_j^{*i-1}$.

ALGORITHM 5: OPTIMAL MEASUREMENT OF NODE FREQUENCIES

1. APPLY ALGORITHM 4 ON THE PROGRAM GRAPH G TO FORM $G_0^*$.

2. THE OPTIMAL MEASUREMENT OF NODE FREQUENCIES OF THE PROGRAM
   GRAPH G IS EQUIVALENT TO THE OPTIMAL MEASUREMENT OF ARC
   FREQUENCIES OF THE GRAPH $G_0^*$.

   THEREFORE ALGORITHM 1 CAN BE APPLIED TO LOCATE THE
   OPTIMAL SITES FOR COUNTER PLACEMENT.
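
The node-splitting step of Algorithm 4 can be sketched as follows. Every node of G becomes an arc with an entry end and an exit end, and each arc of G merges the exit end of its tail with the entry end of its head. The function name, the ("node", "in"/"out") labels and the example graph are hypothetical, and a union-find structure is assumed for the merging.

```python
def zeroth_order_graph(nodes, arcs):
    """Return node name -> (tail, head) in the graph whose arcs are G's nodes."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for tail, head in arcs:
        # Merge the exit end of `tail` with the entry end of `head`.
        parent[find((tail, "out"))] = find((head, "in"))
    return {n: (find((n, "in")), find((n, "out"))) for n in nodes}


# Hypothetical program graph with an arc returning from exit T to entry S.
G0 = zeroth_order_graph(
    ["S", "A", "B", "T"],
    [("S", "A"), ("A", "B"), ("A", "T"), ("B", "T"), ("T", "S")],
)
graph_nodes = {end for ends in G0.values() for end in ends}
```

In this assumed example the result has four arcs and three nodes, so two counters are needed. Node B becomes a self-loop, which can never lie on a spanning tree, so B always carries a counter; this matches the observation that B's frequency cannot be deduced from those of S, A and T.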


AN EXAMPLE OF OPTIMAL NODE FREQUENCY MEASUREMENT


COUNTERS SHOULD BE PLACED ON THE NODES A, B, D, AND E.

$$F(C) = F(A) - F(B)$$
SELF-CHECKING SOFTWARE


Definition:      A PIECE OF SELF-CHECKING SOFTWARE IS A PROGRAM WHICH
                 CHECKS ITS OWN DYNAMIC BEHAVIOR AUTOMATICALLY DURING
                 ITS EXECUTION.

Purposes:        * TO DETECT SOFTWARE ERRORS.
                 * TO ISOLATE SOFTWARE ERRORS.
                 * TO HELP THE SYSTEM TO RECOVER FROM ERRORS.
                 * TO VERIFY THE INTEGRITY OF THE SYSTEM.

Application:     * SELF-CHECKING SOFTWARE IS PRIMARILY DESIGNED TO
                   VERIFY THE CORRECT OPERATION OF THE SYSTEM DURING
                   EXECUTION TIME, ESPECIALLY REAL-TIME SYSTEMS WHERE
                   ULTRA-RELIABILITY IS REQUIRED.
                 * IT IS ALSO USEFUL IN DETECTING AND ISOLATING ERRORS
                   DURING SOFTWARE DEVELOPMENT.

Implementation:  1. BY INTRODUCING SOFTWARE REDUNDANCY (INSTRUCTION
                    REDUNDANCY AND DATA REDUNDANCY) INTO THE SOFTWARE
                    SYSTEM.
                 2. BY BUILDING SPECIALIZED HARDWARE TO PERFORM THE
                    CHECKING.

ADVANTAGES OF SELF-CHECKING SOFTWARE


* It can be implemented at different levels:

    DESIGN LANGUAGE
    HIGH LEVEL LANGUAGE
    MACHINE LANGUAGE
    MICROPROGRAM

* Checking can be performed concurrently with program computation.

* It checks hardware faults in addition to software errors.

* It can serve as a security safeguard against malicious
  intrusion or programmer-planted bugs.

* It can improve the availability of the system.

SYSTEM DESIGN WITH SELF-CHECKING SOFTWARE

* Self-checking software can be applied to different levels

    * SYSTEM LEVEL
    * MODULE LEVEL
    * INSTRUCTION LEVEL
    * DATA LEVEL

* Self-checking software can be applied in different stages of
  software development

    * DESIGN
    * CODING
    * TESTING

* Development of specialized hardware to implement effective
  checking techniques

    * HARDWARE TIMER
    * ASSOCIATIVE PROCESSOR
    * HARDWARE WATCHDOG
TYPES OF SELF-CHECKING TECHNIQUES


► Functional Checking

    • Verify reasonableness of output
        CHECK IMPOSSIBLE CONDITIONS
        CHECK RELATIONSHIPS OF OUTPUT VALUES

    • Verify consistency of output
        SUBSTITUTION OF SOLUTIONS INTO EQUATIONS
        REVERSE TRANSLATION
        VERIFY CONSISTENCY OF RESOURCE STATE

► Control Sequence Checking

    • Infinite loop
        MAXIMUM BOUND CHECKER

    • Illegal branch
        RELAY-RUNNER SCHEME
        POSITIVE CHECK

    • Wrong branch
        DECISION LOGIC IN SUCCESSOR MODULES

    • Incorrect loop termination
        SIMILAR TO WRONG BRANCH

► Data Checking

    • Integrity of data values
        CHECKSUM
        COMPARISON OF STATE OF RESOURCE TO SYSTEM TABLE
        MEMORY PROTECTION MECHANISMS
        COMPARISON WITH BACKUP COPY

    • Integrity of data structures
        LINKAGE CHECKS FOR LINKED LISTS
        INFORMATION ON STATE OF DATA

    • Nature of data values
        DATA FILTER (MAXIMUM AND MINIMUM BOUNDS)
        ASSERTIONS (RELATIONSHIPS BETWEEN DATA)


CONTROL SEQUENCE CHECKING - TO DETECT ALL INFINITE LOOPS


1. Generate the program graph.

2. Determine all strongly connected regions (cycles).

3. Determine the minimum number of branches that need ...
   in order to eliminate all strongly connected regions.

4. Insert a maximum bound checker on each of these branches.


* Maximum Bound Checker

      LOOP1 <- LOOP1 + 1
      IF (LOOP1 .GT. BOUND1) GO TO ERROR

  LOOP1 IS RESET TO ZERO WHEN CONTROL IS TRANSFERRED ...
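
A maximum bound checker of this kind can be sketched as a small counter object. The class and function names, the bounds, and the choice to reset the counter on entry to the loop are assumptions made for illustration; the slide's own reset condition is cut off in the source.

```python
class LoopBoundExceeded(RuntimeError):
    """Raised when a loop runs more often than its declared bound."""


class BoundChecker:
    """Hypothetical maximum bound checker for one loop."""

    def __init__(self, bound):
        self.bound = bound
        self.count = 0

    def reset(self):
        self.count = 0

    def tick(self):
        # LOOP1 <- LOOP1 + 1; IF (LOOP1 .GT. BOUND1) GO TO ERROR
        self.count += 1
        if self.count > self.bound:
            raise LoopBoundExceeded(f"more than {self.bound} iterations")


def gcd(a, b, checker):
    """Euclid's algorithm with a bound checker on its loop."""
    checker.reset()  # assumed: reset whenever the loop is entered
    while b != 0:
        checker.tick()
        a, b = b, a % b
    return a


def countdown_by_two(n, checker):
    """Faulty loop: never terminates for odd n unless the checker fires."""
    checker.reset()
    while n != 0:
        checker.tick()
        n -= 2
    return n


result = gcd(48, 18, BoundChecker(100))
```

Calling `countdown_by_two(7, BoundChecker(1000))` raises `LoopBoundExceeded` instead of looping forever, which is the behavior the error branch is meant to provide.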
CONTROL SEQUENCE CHECKING - TO DETECT WRONG BRANCHES


[Figure: CONVENTIONAL SCHEME. Labels: DATA MUTILATION, MEMORY,
COMPUTATION, DECISION, M1, M2]


DATA CHECKING - TO CHECK THE INTEGRITY OF DATA VALUES

* Checksum

    • A CHECKSUM IS KEPT WITH A SEGMENT OF CODE.
    • THE CHECKSUM IS CHECKED BEFORE THE SEGMENT IS USED.
    • RECOMPUTE CHECKSUM AFTER LEGITIMATE MODIFICATIONS.

* Consistency check

    • THE STATE OF A RESOURCE IS STORED IN A REGISTER WITH THE
      RESOURCE.
    • COMPARE THE SYSTEM TABLE PERIODICALLY WITH THE STATE OF
      THE RESOURCES.

* Memory protection mechanisms

    • RELOCATION AND BOUND REGISTERS (CDC 6000 SERIES).
    • LOCK AND KEY (IBM 360 SERIES).
    • ARRAY BOUNDARY CHECKS (BURROUGHS B5500).
    • PAGING (XDS 940).
    • SEGMENTATION (HONEYWELL 645).

* Backup copy

    • A BACKUP COPY OF INVARIANT CODE AND DATA IS KEPT.
    • OPERATING COPY IS COMPARED PERIODICALLY WITH
      BACKUP COPY.
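
The checksum discipline above (keep a checksum with the segment, check it before use, recompute it after legitimate modification) can be sketched as follows. The class name `GuardedSegment` and the use of CRC-32 from Python's standard `zlib` module are assumptions for illustration; the source does not name a particular checksum.

```python
import zlib


class GuardedSegment:
    """Hypothetical segment of code or data kept together with a checksum."""

    def __init__(self, data: bytes):
        self._data = bytearray(data)
        self._checksum = zlib.crc32(self._data)

    def read(self) -> bytes:
        """Check the checksum before the segment is used."""
        if zlib.crc32(self._data) != self._checksum:
            raise ValueError("checksum mismatch: segment was mutilated")
        return bytes(self._data)

    def write(self, offset: int, new_bytes: bytes) -> None:
        """Legitimate modification: update the data, then the checksum."""
        self._data[offset:offset + len(new_bytes)] = new_bytes
        self._checksum = zlib.crc32(self._data)


segment = GuardedSegment(b"LOAD R1")
segment.write(5, b"R2")
contents = segment.read()
```

A change that bypasses `write`, such as flipping bits directly in the underlying buffer, leaves the stored checksum stale and is reported on the next `read`.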
APPLICATIONS IN PROGRAM VALIDATION

Understanding the dynamic behavior of the program helps generation of
further tests.

► Self-metric software useful for

    • Evaluation of effectiveness of a test
    • Evaluation of testedness of different parts of a program
      (portions of program that are executed most frequently
      and have the most extensive self-checking capabilities
      are most thoroughly tested.)
    • Modification of existing tests to exercise untested portions
    • Detection of anomalies
    • Generation of output validation tests
    • Performance evaluation and optimization

► Self-checking software useful for

    • Test generation by formulating entrance conditions of every
      segment of code
    • Detection, location and isolation of software errors
    • Test validation by inserting assertions
FUTURE RESEARCH

► Evaluate the cost-effectiveness of different self-metric and
  self-checking techniques

► Develop formal methodology in designing self-metric and
  self-checking techniques

► Develop effective techniques for generating self-metric and
  self-checking software

► Program validation

► Program restructuring for reliability, maintainability and
  efficiency improvements
Integrated Software Development System


Presented  by: 


Roy  W.  Coppinger 


Higher  Order  Software,  Inc. 
843  Massachusetts  Avenue 
Cambridge,  Mass.  02139 
(617)  661-8900 
INTRODUCTION*


The Integrated Software Development System (ISDS) is defined
to be a comprehensive system structure that includes a basic
set of principles, a basic set of tools and a basic set of
standards for developing computer-based systems. The entire
structure of ISDS is defined in terms of a common formalized
methodology which underlies all elements of the system [1]. This
formalized methodology supports the specification of the
complete range of software development tools utilized within
the structure. The principles of ISDS are applied to all
phases of system development, and for all disciplines including
design, verification, documentation, management, and maintenance.
The axiomatically consistent development tools of ISDS include a
specification language, a design analyzer, a structuring
executive, automated documentation tools and advanced software
management techniques. A specific goal of ISDS is to bring
software development back in line with other engineering
disciplines through the application of commonly understood
and enforced engineering rules.

The system can be conceptually depicted as a management system
operating on a data base with a sophisticated set of tools
(Figure 1). The development data base contains the inputs and
outputs of each ISDS process. The data management system
uses the appropriate ISDS tool to verify the quality of a
component of the data base or to analyze the interfaces between
components.


FIGURE 1: CONCEPTUAL ISDS SYSTEM


*Excerpt from Roy Coppinger's ISDS/HOS Conceptual Description Working
Document, Chapter 3, Higher Order Software, Inc., in preparation, June 1976.
Each ISDS process will have identifiable inputs and outputs.
Initially, the functional decomposition process has as input
a set of requirements. These requirements may be in narrative
form and can be a collection of 'desires' or a 'shopping list'.
The output is a set of specifications. The specifications
describe the input requirements as functions and functional
relationships independent from a description of the execution
properties of the system.

Formally, this process is aided by a specification language.
The consistency of the specifications is verified by a design
analyzer that checks the syntax of the specification language
statements, and produces a control map to graphically represent
the tree-like structure of the system.

The set of specifications is input to the subsystem allocation
process. The outputs are the specific function allocations for
each part of the system (hardware, firmware, software). Computer
aided design tools are introduced to optimize the system
structure. Optimization criteria are determined by the particular
system constraint.

The set of functions allocated to software becomes the input to
the software development process. The output is the actual
software. (The hardware and firmware developments can be
performed in parallel.) The software functions are written
in a Higher Order Language (HOL). The HOL analyzer checks the
syntax and produces a control map. Since the HOL is computer
sequence dependent, the structured design diagrammer is applied
to aid the software engineer in adhering to the rules of
structured programming. Figure 2 depicts the flow through the
development cycle.

FIGURE 2: A FLOWCHART FOR SOFTWARE DEVELOPMENT


BACKGROUND 


The Integrated Software Development System is based on the
Higher Order Software (HOS) methodology. HOS is a formal axiomatic
system design methodology based on six axioms controlling
interface correctness. HOS evolved from the APOLLO and SKYLAB projects.
Figure 3 is a graphical representation of the software anomaly
analysis for these projects [1]. Of particular interest are the
pre-flight anomalies, 44% of which were discovered by eyeballing and
73% of which occurred at software-software interfaces. The
APOLLO on-board flight software effort was 2000 man-years, half
of which was spent in verification. There were no software
errors during flight but the achievement of this reliability
was very expensive. Higher Order Software was developed as a
formal methodology in order to reduce the costs of developing
such highly reliable software systems.

Two major areas of concentration of the HOS methodology are
interface correctness and static verification (automating the
eyeballing process). Tools have been developed to automatically
verify interfaces of programs without executing them. Interface
correctness in HOS requires adhering to six axioms.

ILLUSTRATIONS OF THE AXIOMS OF ISDS*

For the purpose of illustration the fictional BRIGGEN system
will be used as a framework for explaining the axioms.


*Excerpted from M. Hamilton and S. Zeldin, ISDS/HOS Conceptual
Description Working Document, Chapter 4, Higher Order Software, Inc.,
June 1976. Under contract to U.S. Army Electronics Command,
CENTACS, Ft. Monmouth, N.J.

FIGURE 3: A STUDY OF APOLLO AND SKYLAB SOFTWARE


                    BRIGGEN

          LT. COL 1        LT. COL 2

       MAJ 1        MAJ 2        MAJ 3

             CAPT 1        CAPT 2


FIGURE 4: BRIGGEN Invocation Tree*

The management hierarchy shown in Figure 4 is often referred
to as a tree. BRIGGEN, COL 1, and CAPT 2 each have a particular
position of responsibility. We often refer to this geographic
position as a "node". When we refer to COL 1 giving orders to
LT. COL 1 and LT. COL 2, COL 1 is referred to as a "controller"
or as a "module". When COL 1 receives orders from BRIGGEN he
is referred to as a function. A module is a supervisor; a
function is a subordinate. A function receives its input
values in the input variables, performs its operation, and
places the output values in the output variables.


The six axioms of ISDS/HOS are explained in Figures 5 through
10 using the BRIGGEN system.


*An invocation tree is a representation of a control map
which contains only function names.


Figure 5: AXIOM 1 Invocation Rights


AXIOM 1

Axiom 1 is illustrated in Figure 5. The LT COLONEL is o...
on REQUIREMENTS to produce SPECIFICATIONS.

SPECIFICATIONS = LT.COLONEL (REQUIREMENTS)

If the MAJOR and CONTRACTOR are the only functions tha...
to a control level controlled by LT.COLONEL, then both ...
and CONTRACTOR contribute to completing the LT.COLONEL ...
We refer to this contribution as invocation. Axiom 1 ...
... invocation to the total problem defined by LT.COL...
... controls only the invocation of MAJOR and ...
Figure 6: AXIOM 2 Responsibility Rights


AXIOM 2

In Figure 6 LT.COLONEL is in charge of taking the input
values of REQUIREMENTS: RFP, MEMO, and CONTRACT and p...
output values of SPECIFICATIONS: PROPOSAL AND PROTOTYPE ...
refer to this input/output relationship as responsibility.
Axiom 2 relates each input value so as to control that
responsibility in such a way that only one output value is ...
with a particular input value and for every input value ...
must be a corresponding output value.
Figure 7: AXIOM 3 Output Access Rights

AXIOM 3

The manner in which a module can locate a value of its input
or output variables is via access rights. We distinguish
input access rights from output access rights. Axiom 3 defines
a module's control of output access rights for the functions
it controls.


In Figure 7 LT. COLONEL has output access rights to the variable
SPECIFICATIONS. We can locate an element of the output set such
as PROPOSAL or PROTOTYPE by accessing SPECIFICATIONS. When we
refer to control of output access, we imply that a variable in
the set that is under output access rights for LT. COLONEL
appears as a variable in the set under output access rights for
MAJOR or CONTRACTOR.
Figure 8: AXIOM 4 Input Access Rights


AXIOM 4

Axiom 4 defines a module's control of input access rights for
the functions it controls. In Figure 8 LT. COLONEL has input
access rights to REQUIREMENTS. We can locate an element of the
input set of LT. COLONEL (such as RFP or CONTRACT) via these rights.

We refer to control of input access when we imply that a variable
in the set that is under input access rights for LT. COLONEL
also appears as a variable in the set under input access rights
for MAJOR or CONTRACTOR.


Figure 9: AXIOM 5 Rejection Rights


AXIOM  5 

Techniques  used  to  determine  if  input  values  are  valid  are 
controlled  by  the  module  itself.  This  means  that  if  LT.  COLONEL 
knows  he  can  accept  only  RFP  or  CONTRACT,  then  he  can 
reject  anything  else.  He  can  do  this  without  requesting  his 
subordinates to do it for him. This type of control (i.e.,
rejection  control),  limited  to  the  module  itself,  is  referred 
to  as  Axiom  5 and  is  illustrated  in  Figure  9. 


Figure 10: AXIOM 6 ... Rights


AXIOM  6 

Axiom 6 relates the ordering of functions with respect to
their controller. Axiom 6 requires that LT. COLONEL be
responsible for the ordering of functions MAJOR and CONTRACTOR.
In the example of Figure 10, since PROTOTYPE requires
information from PROPOSAL, LT. COLONEL would ensure that CONTRACTOR
produced PROPOSAL before MAJOR was invoked to produce PROTOTYPE.
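
The rejection and ordering responsibilities of Axioms 5 and 6 can be pictured with a small sketch. It is an informal analogy in Python, not AXES or HOS notation: the function names mirror the BRIGGEN example, while the string values and the use of an exception for rejection are assumptions.

```python
def contractor(requirement):
    """Subordinate function: produces PROPOSAL from the requirement."""
    return f"PROPOSAL for {requirement}"


def major(proposal):
    """Subordinate function: produces PROTOTYPE from PROPOSAL."""
    return f"PROTOTYPE built from {proposal}"


def lt_colonel(requirement):
    """Controller: rejects invalid input itself and orders its functions."""
    # Axiom 5: the module itself decides which input values are valid.
    if requirement not in ("RFP", "CONTRACT"):
        raise ValueError(f"rejected input: {requirement}")
    # Axiom 6: the controller fixes the order; PROPOSAL must exist first.
    proposal = contractor(requirement)
    prototype = major(proposal)
    return {"PROPOSAL": proposal, "PROTOTYPE": prototype}


specifications = lt_colonel("RFP")
```

Neither `contractor` nor `major` checks the requirement or decides when it runs; both decisions stay with the controlling module, which is the point the two axioms make.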


TOOLS 


The axioms illustrate HOS handling of interface correctness.
The Integrated Software Development System makes use of these
axioms throughout the development process. The tools and
techniques of development and verification are axiomatically
consistent. Four such tools used by ISDS are the Specification
Language (AXES), the Analyzer, the Structured Design Diagrammer,
and the Structuring Executive.


SPECIFICATION  LANUGUAGE* 

In  order  to  be  successful  in  describing  the  specification  of  a 
given  system,  we  need  to  consider:  first,  all  aspects  of 

reliablility ; and  second,  all  aspects  of  clarity.  A specifica- 
tion needs  to  be  reliable  in  order  to  ensure  that  it  is 
complete  and  consistent.  A specification  needs  to  be  clearly 
defined  in  order  that  it  can  be  properly  communicated  to 
managers,  users  and  designers. 

The  specification  language,  AXES,  is  a language  which  can  be 
used  by:  (1)  managers  to  convey  the  definition  of  a given 

system;  (2)  users  to  define  a specification  for  a given 
system;  (3)  designers  to  create  mechanisms  which  are  buliding 
blocks  for  a given  specification.  AXES  is  intended  to  provide 
the  necessary  mechanisms  for  the  specification  of  any  given 
system.  In  addition,  it  will  include  recommended  building 
blocks  for  the  specifications  of  representative  systems. 

From  the  language  mechanisms,  new  building  blocks  can  be 
defined.  In  addition,  other  new  building  blocks  can  be 
constructed  from  existing  building  blocks  for  individual  system 
needs.  From  these  building  blocks,  systems  can  be  described 
as  abstract  or  as  detailed  as  desired. 


*Excerpt from The Foundations for AXES: A Specification Language Based
on Completeness of Control, Charles Stark Draper Laboratory, Inc.,
March 1976. Under contract to U.S. Navy, Naval Electronics
Laboratory Center, San Diego, CA.


To  date,  extensible  features  in  systems,  such  as  programming 
languages, have not been generally successful [4, 5]. We believe
that  the  mechanisms  provided  in  the  specification  language  will 
facilitate  the  development  of  extensible  systems  as  first 
envisioned. 

AXES  will  provide  the  mechanisms  for  stating  the  properties  of 
a given  system.  Several  properties  of  specification  distinguish 
it  from  the  implementation  of  that  specification.  Many  times, 
when  designers  specify  a given  system,  they  assume  implicitly 
that  the  implementations  they  are  familiar  with  account  for 
those  properties.  For  example,  just  because  one  implementation 
system  provided  a lock  mechanism  for  shared  data  does  not  mean 
that  all  implementation  systems  will  do  so.  It  is  very 
difficult  for  designers  to  remove  these  biases,  especially 
when  the  assumptions  made  are  accounted  for  by  all  known 
implementation  systems  to  date.  We  should  consider,  however, 
that  some  implementation  features  that  are  common  today  may 
not  be  the  type  of  implementation  features  that  we  would 
envision  for  the  future.  More  significantly,  they  may  be 
restricting  us  in  the  design  process  more  than  we  realize. 

A given  specification  should  be  independent  of  the  particular 
"machine"  that  is  used  to  implement  that  specification.  That 
is,  the  specification  is  not  dependent  on  implementation 
tools  such  as  hardware,  software,  a human  operator,  compiled 
code  from  a higher  order  language,  an  interpreter,  or  a 
software  subroutine.  Each  specification  assumes  only  that 
there  is  a machine  mechanism  which  will  realize  the  properties 
of  that  specification.  This  distinction  in  design  is  important; 
for once a system is designed dependent on implementation tools,
the independence or portability of that specification becomes
limited.

A given specification should be independent of the particular
system  design  concepts  that  are  assumed  in  the  implementation 
of  that  specification.  For  example,  for  a given  specification, 
functions  can  communicate,  in  concept,  cyclically  with  each 
other.  However,  the  implementation  of  that  specification  could 
assume  an  asynchronous  environment  whereas  another  implementation 
could  assume  a synchronous  environment.  It  is  advantageous  to 
define  a specification  independent  of  the  system  design  concepts 
of  the  implementation  in  order  that  a given  specification  could 
be  transferred  from  one  implementation  to  another  implementation. 

A given specification should be defined with unambiguous symbols,
words, or combinations thereof. With languages such as English,
there is much confusion as to intent or meaning, since more
than one meaning can often be attached to the same word when
used in one particular context, or the same word can
have different meanings depending on the context in which it
is used. (In fact, a change of inflection in one's voice can
change the meaning of words.) It is entirely possible to design
a formal language in which the phrases look like English [6].

In  AXES,  we  can  show  the  relationship  between  the  formal  semantics 
of  an  abstract  control  structure  and  its  English-like  equivalent. 
But  we  must  also  consider  the  syntax  associated  with  that 
control  structure  to  be  unambiguous. 

A successful specification must, therefore, be designed independent
of implicit assumptions; it must be designed independent
of  its  implementation  tools;  it  must  be  designed  independent 
of  system  implementation  design  concepts;  it  must  convey  intent 
unambiguously;  and  it  must  provide  mechanisms  to  describe  the 
properties  of  systems  in  as  abstract  a manner  as  desired. 

AXES  will  provide  a reliable  means  in  which  to  define  a successful 
specification.  Reliability  is  to  be  obtained  from  the  properties 
of systems defined by Higher Order Software [1]. These properties
are embedded in the specification techniques for defining
abstract control structures. Each abstract control structure
uses abstract data types to complete a given system specification.

With  AXES,  the  abstract  control  structures  can  be  defined  in 
such  a way  as  to  both  conform  to  the  formalism  required  for 
reliability  and  to  convey  intent  of  the  designer  to  managers 
and  users  of  a given  system  application. 


ANALYZER 

The main function of an analyzer is to guarantee that a given
hierarchical system is consistent with the axioms. There
are at least three representations of a software problem:

(1) Specification Language Representation

(2) HOL Program Representation (compilable source code)

(3) Executable Code Representation

An analyzer addresses reliability issues in the first two
representations, specifically, "do representations 1 and 2
violate the six HOS axioms?" and "is representation 2 in compliance
with the specification in 1?".

The most costly errors in software development are those that
occur in the design process and remain undiscovered until system
testing. By having analyzers available for the specifications
and the HOL code, the Integrated Software Development System can
eliminate a large percentage of the errors before the code is
executed.

A Parnas-type specification of a Navy Message Processor was
manually analyzed, revealing 29 errors in the specification.
Twenty-seven of the 29 errors were found to be interface errors that
would be detected by the analyzer [7].


STRUCTURED DESIGN DIAGRAMMER


The Structured Design Diagrammer [8, 9, 10] (also referred to as the
Automated Flowcharter) is a computer program whose function is
to produce documentation from source statements for an HOL.

Up-to-date flow diagrams can be produced automatically for
each revision of a program module. These flow diagrams help
the programmer/engineer use structured programming conventions
by providing a visual representation of the block
structure, the scope, and the data flow inherent in the program.

A Design  Diagrammer  has  been  developed  and  implemented  for 
HAL, a block-structured Higher Order Language (HOL). This
design diagrammer is used routinely as a production tool for
software development for the SPACE SHUTTLE. Programmers are
able to obtain this output by simply inserting an extra job
control command at the program compilation step. Representative
output of the operational structured design diagrammer on a
line printer and a CalComp plotter is presented in Figures 11
and 12.

Demonstrated advantages of the existing Design Diagrammer
include:

1. Automation of the documentation process.

2. Improved management control of software
development.

3. Effective enforcement of structured
programming techniques.

Use of automated documentation in the SPACE SHUTTLE effort has
materially reduced the cost of flow diagram preparation and
has resulted in rapid communication of up-to-date documentation
among all participating organizations.

FIGURE 12: EXAMPLE OF DESIGN DIAGRAM IN PLOTTED FORM

(From Space Shuttle Guidance Routine)

Prior to the automation of the structured design diagrams,
programmers  and  engineers  produced  design  diagrams  manually  for 
the  purpose  of  structuring  an  algorithm  before  it  was  coded. 

Now,  the  automatically  produced  design  diagram  is  used  as  a 
comparison  with  the  original  manually  produced  design  and  as  a 
means  of  producing  up-to-date  and  automatic  documentation  of 
the  computer  program. 

The salient features of the structured design diagrams are more
applicable for software development than those of conventional flow
diagrams. Figure 13 illustrates the two conventions used in
generating design diagrams. Referring back to Figures 11 and 12,
lines connect nested decision levels and linear execution flow.
Notice that after each logical step has been completed, control
returns to the main flow of the algorithm. It is easier to
see the state of the system at each node. The nested decision
levels are clearly distinguished. Given the algorithm in this
form, the equation can be "eyeballed" for correct formulation.


Used for all relations
and Boolean expressions.


Used  for  invocations 
(explicit  and  implicit) 


FIGURE  13:  DESIGN  DIAGRAM  CONVENTIONS 


STRUCTURING  EXECUTIVE 

The Structuring Executive [1] is a lower virtual layer module with
respect to a given hierarchical HOS system in its dynamic state.
The structuring executive is used to implement, in real time,
multiprogramming control constructs. The structuring executive:

1) controls the ordering of those modules which can
vary in real time dependent on operator selection.

2) assigns priorities to processes based on their
relative priority relationships (Axiom 6) for
each control level.

3) prevents a violation of the HOS axioms so that no
two processes can conflict with each other.

4) determines when the total resources of the
computer are approached.

The structuring executive is the dynamic analog to the analyzer's
static verification of interfaces.


CONCLUSION 

The lack of a disciplined development system is the major
contributing factor to the Department of Defense's problem of
producing reliable software [11, 12]. The Integrated Software
Development System will contain the tools, techniques, and
principles necessary to produce reliable software at a
cost-effective price.


VERIFICATION AND ERROR DETECTION
USING A SYMBOLIC EVALUATOR


T. E. Cheatham, Jr.
Center for Research in Computing Technology

(Notes on a talk delivered on 5 August 1976 at the Invitational DOD/Industry
Conference on Software Verification and Validation, Syracuse, New York.)


1. Overview

- The Harvard Program Manipulation System (PMS) - a collection of tools
for aiding in the development, validation, and maintenance of programs.

- Some remarks on the programming process.

- The EL1 Symbolic Evaluator.

- Application of the Symbolic Evaluator to program verification and fault
finding.

2. The Harvard Program Manipulation System

- A comprehensive programming system with a variety of tools and facilities
for aiding in all aspects of the process of program development, validation,
and maintenance.

- Some tools and facilities:

* The EL1 programming language.

* Rewrite facilities and editors.

* Probe and monitor facilities.

* Program checker.

* Program verifier.

* Program data base and query facility.

* Symbolic Evaluator.

* Cost expression development tools.

* High level optimizers.


3. Remarks on the Programming Process


The "programming process" commences with some "abstract program" -

- sometimes this is stated mathematically, e.g., sorting, parsing,
solving linear equations.

- but more often not:

payroll, inventory, MIS, satellite tracking, ...

But,  in  any  event,  the  abstract  program  is  "understood"  by  humans  to 
constitute  a specific  algorithm  at  some  level  of  abstraction. 

Given the initial "abstract program", the programming process involves
making choices:

- division of the program into modules, sub-modules, and so on

- representation of data objects

- representation of operations (transactions, transformations, etc.)

This may (or should) be done at several levels, consistent with the
tenets of Structured Programming.

The  goals  of  the  choice  are 

1.  preservation  of  functionality 

2.  efficiency 

3.  preservation  of  clarity 

The domain of choice is some inventory of possibilities; these are strongly
dependent upon

- the  language  and  system,  and 

- the  experience  and  skill  of  the  programmer. 

The bases for making a choice are facts or predicates which enable a
given choice. How do we get these facts?

- inherent - i.e., we must maintain functionality

- deduced - e.g., we note that some array is sorted at some point
and take advantage of that fact

- decided arbitrarily - e.g., we decide that an identifier will
be less than 30 characters long, or we decide that some array
will be sorted at some point.

We propose a strong need for a dramatic departure from current programming
practice - that is, these facts are every bit as much a part of the final
program as is, for example, some load accumulator instruction occurring
in a run time load module.

A dual problem to that of making choices is that of optimization of some
extant program

- doing x in a "better" way

- not doing something because we know we do not need to

There are the "usual" kinds of optimization

- common  subexpression  elimination 

- register  allocation 

- retrieval  of  storage  from  "dead"  variables 

- and  so  on. 

There are also a number of "high level" optimizations which might be
considered.

- recursion  removal 

- loop  fusion 

- loop  cleavage 

- program  simplification 

- loop  reduction 

- and  so  on. 

But  again,  optimizing  transformations  are  enabled  by  facts  (predicates) 
about  the  program. 


The above requires reasoning about a program - and that in turn requires
a calculus, a collection of mathematical functions and rules which permit
us to describe and reason about what is happening in the execution of a
program.

- these must obviously include the usual arithmetic and logical
operations

- they must also include a logical calculus (e.g., the first order
predicate calculus)

- they may also include such mathematical operations as finite sums
and products, functions which define arrays, and list structure
(e.g., that array whose jth element is f(j), the least j such
that ..., and so on).


The EL1 Symbolic Evaluator


The EL1 programming language is a complete programming language which has
been used in the development of several systems programs. It provides

- integers, reals, characters, booleans

- homogeneous arrays

- records (structures or non-homogeneous arrays)

- pointers 

- shared  values 

- block  structure  with  local  variables 

- loops 

- dynamic  scoping 

- and  so  on. 

Given some program, the goal of the Symbolic Evaluator is to develop
functions which describe the values of all variables in all contexts -
where the values of unknown quantities are represented symbolically.

The  SE  has  three  major  components 

- symbolic  execution 

- loop  analysis 

- simplification  (including  a theorem  prover) 

Symbolic  Execution 

The symbolic execution of some program (represented as, essentially, list
structure) results in

- a "shadow" of the program - a representation in which modes of all
quantities are known, all implicit computations are made explicit
(e.g., dereferencing, the "hidden" computation involving loop variables,
and so on), and manifest constants are propagated.

- a context graph, or flow graph, representing the possible paths of
control.

- an environment, consisting of a "location" for each distinct variable
occurring in the program, plus a list of the values of that variable
in various contexts (represented as symbolic expressions, in general).
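
As a rough illustration of the environment described above, the following sketch symbolically executes a straight-line list of assignments. It is not the EL1 Symbolic Evaluator: the tuple encoding of expressions and the names `sym_eval` and `sym_exec` are assumptions made for this example, and control flow (the context graph) is omitted.

```python
def sym_eval(expr, env):
    """Evaluate an expression tree against a symbolic environment.

    An expression is an int constant, a variable name (str), or a tuple
    (op, left, right) with op in {"+", "*"}.
    """
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        # Unknown inputs stay symbolic: a variable with no recorded value
        # evaluates to its own name.
        return env.get(expr, expr)
    op, left, right = expr
    a, b = sym_eval(left, env), sym_eval(right, env)
    if isinstance(a, int) and isinstance(b, int):
        # Manifest constants are propagated.
        return a + b if op == "+" else a * b
    return (op, a, b)

def sym_exec(program):
    """Symbolically execute a list of (variable, expression) assignments.

    Returns the final environment and, for each variable, the list of
    (context, value) pairs recording its value after each assignment.
    """
    env, history = {}, {}
    for context, (var, expr) in enumerate(program):
        env[var] = sym_eval(expr, env)
        history.setdefault(var, []).append((context, env[var]))
    return env, history

program = [
    ("x", ("+", 1, 2)),        # x := 1 + 2
    ("y", ("+", "n", "x")),    # y := n + x   (n is an unknown input)
    ("x", ("*", "y", 2)),      # x := y * 2
]
env, history = sym_exec(program)
```

After execution, `history["x"]` holds one value per context in which x was assigned: first the constant 3, then a symbolic expression in the unknown n.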

Loop Analysis


The  loop  analyzer  does  two  things 

- determines  the  value  of  each  variable  which  may  be  modified 
within  the  body  of  the  loop  as  a function  of  a symbolic  quantity, 
k,  which  denotes  the  cycle  index  (i.e.,  k would  be  one  on  the 
first  cycle;  two  on  the  second,  and  so  on). 

- solves for the value k_L, the value of the cycle index on the last
cycle taken, in general a symbolic expression.

The basic method is as follows: We postulate, for each quantity which may
change in the body of the loop, some symbolic value, B, as its value at the
beginning of a general (the kth) cycle. We then do a symbolic execution of
the loop body. Assuming we are to cycle again, a given variable then has some
symbolic value, A, at the end of the kth (beginning of the (k+1)st) cycle.

The quantities B and A provide a recurrence relation for the change in the
variable during the kth cycle, with the value of that variable immediately
preceding the loop as boundary condition.

Solving the recurrence relations is straightforward if the difference A-B
happens to be constant or a function of k.
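
To make the straightforward case concrete: if the change A-B during cycle i is d(i), the value at the beginning of the kth cycle is the boundary value plus d(1) + ... + d(k-1), which for a constant difference c reduces to v0 + (k-1)c. The sketch below checks this numerically; it is not a symbolic solver, and the function name and the example loop are assumptions made for illustration.

```python
def value_at_cycle_start(v0, diff, k):
    """Value of a loop variable at the beginning of the kth cycle (k >= 1).

    `v0` is the value immediately preceding the loop (the boundary
    condition) and `diff(i)` is the change A - B made during cycle i.
    """
    return v0 + sum(diff(i) for i in range(1, k))

# Example loop:  s = 0; i = 0; repeat: i = i + 1; s = s + i
# During cycle k, i changes by the constant 1 and s changes by k.
i_at = lambda k: value_at_cycle_start(0, lambda j: 1, k)
s_at = lambda k: value_at_cycle_start(0, lambda j: j, k)
```

For this loop the sums have the closed forms i = k-1 and s = k(k-1)/2 at the beginning of the kth cycle.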

In general, of course, the difference A-B is conditional. If this is the
case we can look for standard kinds of conditionality and still often obtain
a function describing the value of the variable at the beginning of a general
cycle

- one example is "first/rest" behavior, which happens when, for some
variable, the loop has two stable states

- another example is "subset" behavior, when a variable is modified on
some subset of the cycles

- and so on.

The important point is that we can study conditionality by postulating, for
example, first/rest behavior, and then verifying that this is or is not the
case by substitution of appropriate values into the symbolic expression
describing the values of variables at the end of a general cycle in order to
test the hypothesis.

If all else fails, we can always represent the value of a variable by a
recursion equation - but even these are subject to reasoning, analysis, and
simplification.

The number of cycles taken, k_L, is determined by finding for each exit
(say the ith) that predicate involving k which enables that exit, say p_i(k).
k_L is then the least j such that p_1(j) or p_2(j) or ..., a function which
often simplifies to a constant or symbolic value (e.g., the length of some
array in some dimension).
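
The definition of k_L as a least index can be sketched by direct search over concrete predicates. The Symbolic Evaluator works on symbolic predicates, so this is only an illustration; the function name, the `limit` safety bound, and the array-scan example are assumptions.

```python
from itertools import count

def last_cycle_index(exit_predicates, limit=10_000):
    """Return the least j >= 1 for which some exit predicate p_i(j) holds.

    `limit` is a safety bound added for this sketch; None is returned if
    no exit is enabled within it.
    """
    for j in count(1):
        if j > limit:
            return None
        if any(p(j) for p in exit_predicates):
            return j

# A loop scanning an array of length n for a key, with two exits:
#   exit 1: the index runs past the end of the array  (j > n)
#   exit 2: the key is found at position j
a, key = [7, 3, 9, 4], 9
n = len(a)
exits = [lambda j: j > n, lambda j: j <= n and a[j - 1] == key]
k_L = last_cycle_index(exits)
```

When the key is absent, only the first exit can fire and the result depends on the array length alone, which is the kind of simplification the text mentions.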

Simplification

A powerful  simplifier  is  required  to  keep  the  size  of  expressions  in  hand  and 
to  render  them  fit  for  human  consumption. 

At present the boolean expression simplifier includes a complete
solver for systems of linear inequalities.

The simplifier is divided into two components. One is a "black box" simplifier
which handles standard cases (e.g., 1+1=2, TRUE or P = TRUE, etc.) and puts
expressions into a normal form. The other is, essentially, a collection of
axioms which can be applied to an expression. New axioms are readily added
by the user in a "human" notation.
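
The two-component structure can be sketched as a fixed normalizer plus a list of user-supplied rewrite rules. This is a toy under stated assumptions: the tuple encoding, the names `normalize`, `simplify`, and `plus_zero`, and the choice to apply axioms only at the top level of an expression are all hypothetical and do not describe the actual simplifier.

```python
def normalize(e):
    """'Black box' part: fold constants and apply TRUE/FALSE identities.

    Expressions are atoms (ints, bools, variable names) or binary
    tuples (op, left, right).
    """
    if not isinstance(e, tuple):
        return e
    op, a, b = e[0], normalize(e[1]), normalize(e[2])
    if op == "+" and isinstance(a, int) and isinstance(b, int):
        return a + b                      # e.g. 1 + 1 = 2
    if op == "or" and (a is True or b is True):
        return True                       # TRUE or P = TRUE
    if op == "or" and a is False:
        return b
    if op == "or" and b is False:
        return a
    return (op, a, b)

def simplify(e, axioms):
    """Normalize, then apply user-supplied axioms until nothing changes.

    Each axiom is a function returning a rewritten expression or None.
    In this sketch axioms are tried only on the whole expression.
    """
    e = normalize(e)
    for axiom in axioms:
        rewritten = axiom(e)
        if rewritten is not None and rewritten != e:
            return simplify(rewritten, axioms)
    return e

# A user-added axiom: x + 0 = x.
def plus_zero(e):
    if isinstance(e, tuple) and e[0] == "+" and e[2] == 0:
        return e[1]
    return None
```

For example, `simplify(("+", "x", ("+", 0, 0)), [plus_zero])` first folds the inner sum to 0 and then applies the user axiom to obtain `"x"`.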


4. Applications of the Symbolic Evaluator

Given some program P and the result of its symbolic evaluation, we envision
several applications of information derived about the program

- high level transformations to improve a program

- debugging assistance

- automatic or semi-automatic selection of data representations
in certain standard cases (e.g., sets, sparse arrays, and the like)

- program  verification 

- mechanical  error  detection. 

We  look  at  the  latter  two  in  somewhat  more  detail. 


Verification 

The  use  of  a full  symbolic  evaluator  in  program  verification  often  precludes 
the  need  to  insert  inductive  assertions  inside  a loop  - the  expressions 
describing  the  behavior  of  a variable  on  a general  cycle  and  at  exit  are  often 
sufficient. For example, the proof that the usual gcd algorithms are correct
is a simple theorem concerning the exit values of affected variables.

The "frame" problem is often significantly alleviated. For example, it may be
manifest that the value of some quantity does not change or that the values
in some array at the end of a loop are the same set of values as were in the
array at the beginning - eliminating the need to set up and prove the
verification conditions ensuring this for every operation within the loop.

Mechanical  Error  Detection 

By error detection we refer to ensuring that a program does not do certain
things it should not do - independent of whether it does what it should
do. Examples are

- subscript range verification

- pointer selection

- non-initialized variables

- excluding division by zero

- and so on.


In addition, we expect to extend the fault finding mechanisms to guarantee
program protection and security. That is, if a certain class of transactions
with certain variables and/or procedures is denied, the techniques of fault
finding can be applied to guarantee that these restrictions are not violated.


5.  Where  Are  We? 

At the present time, the Symbolic Evaluator is "limping" - by late fall it
should be operational but lacking in certain sophistication.

We have commenced studying the application of the SE to program verification
and fault finding.


A SYSTEM FOR THE VERIFICATION OF JOCIT PROGRAMS


Bernard Elspas and Jay M. Spitzen
Stanford  Research  Institute 
Menlo  Park,  California 


SUMMARY 


This paper describes current research and development** in the
Computer Science Group at SRI aimed at ultimately producing a programming
environment in which a JOVIAL programmer can write, debug, and prove
correctness for his programs. The first phase (RPE/1) of this work
resulted in an experimental system at SRI capable of handling verification
of small program modules written in a subset of JOVIAL (mainly of J3).
The current (RPE/2) phase is aimed at extending the capabilities of the
original system to handle verification for programs that use most of
the features of JOVIAL (JOCIT-J3). Moreover, the system is to be
transportable to the RADC-MULTICS Honeywell 6180 computer.

Both phases of this work have been based on the technique of
specifying program behavior by inductive assertions and entry/exit
assertions (the Floyd approach). The verification systems produce
verification conditions (VCs) automatically for any legal program that has
been properly annotated with inductive assertions. Each loop of the
program must be interdicted by at least one such assertion. These VCs are
mathematical-logical formulas which, if they can be validated, serve to
demonstrate consistency between the program and user-supplied assertions.
A user-guided deductive subsystem of the verifier is used to generate
formal mathematical proofs of the VCs.

The front end of the verification system consists of (1) a parser
transducer that analyzes a JOVIAL program file and transduces it to
tree-structured form, and (2) a verification condition generator (VCG)
that produces VCs from this parse. Thus, the front end can be regarded as
a kind of 'compiler' (a verification condition compiler) that performs
some syntactic checks on the input program, but which outputs VCs instead
of compiled code. The actual semantic checking of the program is, of
course, carried out on the VCs by the deductive system. The VCG
operates by computing the "weakest precondition" WP[S,Q] for each segment
S of the program as determined by the semantics of JOVIAL and the
assertion Q affixed to the end of the segment. Thus, WP[S,Q] is a formula
in predicate calculus which, if it is valid on entry to the program
segment S, ensures that Q is valid if execution reaches Q. In Hoare's
notation:

WP[S,Q] {S} Q holds.


**Under Contract F30602-76-C-0201! with Rome Air Development Center.
The first phase (also for RADC) was conducted under Contract
F30602-75-C-00A2.

Moreover, WP[S,Q] is the weakest such formula, i.e., it is implied by any
formula P with the property P {S} Q. The semantics of JOVIAL are incorporated
in (Hoare-type) inference rules -- one for each JOVIAL construct --
that specify the mapping WP: S x Q -> Q'. For example, if S is a
JOCIT simple variable assignment, var = expr $, the corresponding
transduced form is (:= var expr), and WP[S,Q] is given by:

(SUBST expr var Q)

where SUBST denotes the substitution of (a free occurrence of) expr for
each (free) occurrence of var in Q. Similar formulas apply for each of
the JOVIAL statement constructs. When the VCG encounters an assertion at
the head of a segment S in this process of working backwards through the
program, VCG appends the new verification condition for that segment to
its list of previously collected VCs. Finally, when the input assertion
at the head of the whole program is reached, VCG constructs an overall
implication of the form:

<input-assertion> ==> <previously-constructed-formulas>

and outputs this formula.
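
The substitution rule for assignment, applied backwards through a sequence of statements, can be sketched as follows. This is a minimal illustration and not the SRI VCG: the prefix-tuple encoding of formulas, the names `subst` and `wp`, and the restriction to quantifier-free formulas and simple assignments are assumptions made for the example.

```python
def subst(expr, var, formula):
    """(SUBST expr var Q): replace each occurrence of `var` in `formula`
    by `expr`. Formulas are nested tuples in prefix form; this sketch has
    no quantifiers, so every occurrence is free."""
    if formula == var:
        return expr
    if isinstance(formula, tuple):
        # Keep the operator, substitute in the operands.
        return (formula[0],) + tuple(subst(expr, var, f) for f in formula[1:])
    return formula

def wp(stmts, q):
    """Weakest precondition of a list of (':=', var, expr) statements
    with respect to postcondition q, computed by working backwards."""
    for op, var, expr in reversed(stmts):
        if op != ":=":
            raise ValueError("only simple assignment is handled in this sketch")
        q = subst(expr, var, q)
    return q

# x := x + 1 ; y := x * 2   with postcondition  y > 0
program = [(":=", "x", ("+", "x", 1)), (":=", "y", ("*", "x", 2))]
pre = wp(program, (">", "y", 0))
```

The result `pre` is the formula (x + 1) * 2 > 0 in prefix form; a verification condition would then be the implication from the entry assertion to this formula.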

The VCG also analyzes all of the declarations in the (transduced)
subject program and creates subsidiary assertions from these declarations.
In particular, this permits VCG to carry out a hierarchical treatment
of subprocedures and functions invoked within a JOVIAL program, so
that the proof of correctness of the main program can be decoupled from
the proofs for its submodules. The proof partitioning process is an
important aspect of program verification as carried out by our system, since
it makes manageable the complexity associated with larger programs,
provided they are written in structured form. It has the benefit that
subprocedures do not have to be reanalyzed (and proved correct) for each
invocation in the main program. However, the human verifier must supply
entry and exit assertions that adequately characterize each such
subprogram.


The heart of this program verifier, as with other such systems,
is its deductive component, or 'theorem prover'. The purpose of the theorem
prover is to attempt proofs of validity for the mathematical-logical
formulas (VCs) constructed by the VCG. It should be noted that the VCG
eliminates all of the details of control semantics of the subject JOVIAL
program in the process of constructing the VCs. The VCs are formulas (in
first-order predicate calculus) which a human mathematician could usually
decide as to validity without any knowledge of JOVIAL programming, if he
were tireless enough. Most of the facts needed in proving validity of
VCs are purely mathematical ones, residing in the domains of the algebra
of integers and real numbers, and of propositional logic. Such facts as
commutative, associative and distributive laws, the special properties of
0 and 1, the properties of equality and inequality relations, and the
definitions of a few other important primitive functions (exponentiation,
log, antilog and square root, for example) are built into the deductive
system. A deductive subsystem called the 'expression simplifier' plays a
major role in such simple deductions. The deductive system is also informed
about methods for handling proofs of logical formulas built up
from the connectives AND, OR, NOT, IF, IMPLIES, and the logical quantifiers
"forall" and "exists", although the latter play a relatively minor
role. The logical formula domain of the VCs is here referred to as the
(abstract) 'assertion language' of the verifier. The term 'concrete'
will be used to refer to expressions in their original untransduced form.
Any assertion language formula can legally appear as an entry/exit assertion
or as an inductive assertion in a subject program. The (concrete)
assertion language contains all legal, executable Boolean formulas of JOVIAL,
and also allows logically quantified Boolean formulas and certain
other modes of expression.

The Deductive System comprises several distinct layers with different
capabilities and modes of user interaction. A Proof Supervisory
Executive subsystem permits the user to invoke deduction on any selected
VC (or portion thereof). This Proof Supervisor is an implementation of
the Smullyan method of analytic tableaux. It first attempts purely
automatic deduction, and often this is all that is needed. When automatic
deduction is unsuccessful, the tableaux system asks the user for guidance,
while displaying and saving the state of the partially successful proof.
The user can then invoke additional information in the form of lemmas,
definitions, or previously proved results, and this information becomes
part of the hypothesis of the formula to be proved. The tableaux system
also permits the user to invoke special 'black boxes' for certain kinds
of deductions. One such special deduction subsystem is a decision
mechanism for (an extension of) Presburger arithmetic. This 'Presburger
box' can decide validity of quantifier-free formulas in Presburger
arithmetic extended by propositional (Boolean) variables and function
symbols with equality. This domain is a decidable one: valid formulas
within it are certified as VALID by the decision box, while invalid
formulas are disproved by finding and exhibiting a counterexample.
Extension of the system to handle wider domains is now under
consideration, but will probably result in loss of full decidability.
However, even a semidecision algorithm would be of considerable utility,
though this means that the system would forfeit its ability always to
discover counterexamples for invalid formulas. It should be emphasized
that when the user invokes special deduction techniques or auxiliary
information to arrive at a tableaux proof, the Proof Supervisor keeps a
record of what additional information was used, and reports that the
proof of correctness was contingent on that auxiliary information or
those procedures.
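
The tableau idea mentioned above can be illustrated with a minimal
sketch for propositional formulas only. This is not the RPE Proof
Supervisor; the formula encoding (nested tuples) and the names closes
and is_valid are assumptions made for this illustration. To test
validity, the formula is assumed false and the signed formulas are
expanded; if every branch reaches a contradiction the formula is valid,
and an open branch corresponds to a counterexample.

```python
# Formulas are either a variable name (str) or a tuple:
#   ("not", f), ("and", f, g), ("or", f, g), ("implies", f, g)
# A signed formula is a pair (sign, formula); sign True means "assumed true".

def closes(branch, literals):
    """Return True if every way of extending this tableau branch closes."""
    if not branch:
        return False  # fully expanded open branch: a counterexample exists
    (sign, f), rest = branch[0], branch[1:]
    if isinstance(f, str):  # atomic formula
        if literals.get(f, sign) != sign:
            return True  # the variable is already assumed with the other sign
        return closes(rest, {**literals, f: sign})
    op = f[0]
    if op == "not":
        return closes([(not sign, f[1])] + rest, literals)
    a, b = f[1], f[2]
    if op == "implies":  # treat A -> B as (not A) or B
        return closes([(sign, ("or", ("not", a), b))] + rest, literals)
    if (op == "and") == sign:
        # Non-branching rule: true conjunction or false disjunction.
        return closes([(sign, a), (sign, b)] + rest, literals)
    # Branching rule: false conjunction or true disjunction.
    return (closes([(sign, a)] + rest, literals)
            and closes([(sign, b)] + rest, literals))

def is_valid(formula):
    """A formula is valid if assuming it false closes every branch."""
    return closes([(False, formula)], {})
```

For example, is_valid(("implies", ("and", "p", "q"), "p")) holds, while
("implies", "p", "q") leaves an open branch (p true, q false), which plays
the role of the counterexample a decision box would exhibit.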

In the RPE/1 JOVIAL Verification System only a subset of JOVIAL
constructs was accommodated. These comprised assignment to simple
variables and to arrays; compound, conditional and alternative statements;
an iterative statement (this last following J73 conventions); and procedure
calls. This RPE/1 subset was therefore a structured subset of JOVIAL.
In the current RPE/2 phase we plan to accommodate a much wider subset of
JOVIAL (J3-JOCIT). All but certain machine-dependent constructs are
handled by the front end parser-transducer. The rest of the system will
accommodate all data and procedure declarations, file and table I/O, the
'exchange', 'alternative', 'switch' and 'goto' statements (among others),
in addition to the features that were handled by the RPE/1 system. We
expect that the system to be delivered will be a semi-practical
verification tool for skilled programmers. Further work will be required,
however, to make it usable by non-specialists in program verification
techniques.

The  RPE/1  verification  system  and  the  partially  completed  RPE/2 
system  presently  reside  in  files  at  the  SRI-AI  (ArpaNet)  DEC-10  computer. 
They are implemented in INTERLISP. The RPE/1 system is being translated
into Maclisp, and will be transported to the RADC-MULTICS Honeywell 6180
computer facility as part of the project work of the current phase.

APPENDIX A -- JOCIT PARSER-TRANSDUCER


A Sample  Dialog  with  the  Interactive 
JOCIT  Parser 

The JOCIT program shown immediately below contains syntax errors at
the lines marked by asterisks (*). This faulty program resides in a
TENEX file called TP.ERRORS. We will show below the actual text of
a dialog with the interactive (error-correcting) parser/transducer
developed by Boyer and Moore.

[The JOCIT program:]

START PROC TPROG(AA,BB = ZZ) $
ASSERTIN BB GQ 0 $
ASSERTOUT ZZ EQ AA * BB $
ITEM AA I 20 S $
ITEM BB I 20 U $
ITEM ZZ I 20 $                   *[type designator S/U missing]
ITEM II I 20 P $                 *[the P is illegal here]
BEGIN
ZZ = 0 $
II = 0 $
LL. ASSERT ZZ EQ AA * II $
IF II NQ BB $
ZZ = ZZ + AA $
GOTO LL                          *[terminator $ is missing]
END
TERM $

[End of the JOCIT program]

[The next section displays interaction with the parser/transducer carried
out in Lisp. As above, comments that were typed in later are enclosed in
brackets []. Except for the user responses (noted marginally as such)
all other text was supplied by the system.]

U J { TP . ERRORS  ) [User  asks  for  parse  of  the  file] 

Illegal lexeme $ (of type $) in
    I 20 S $
    ITEM BB I 20 U $
    ITEM ZZ I 20 $!              [Diagnostics: the ! indicates that
    ITEM II I 20 P $              the preceding lexeme was illegal]
    BEGIN
    ZZ = 0
A lexeme of one of the following types was expected:
    S U                          [More system-supplied diagnostics]
What should be done? Replace with: " S $"
                                 [User uses R(eplace) option: only
                                  R and the string " S $" were typed
                                  by the user. He could also have
                                  typed T(eco) to invoke a text
                                  editor, or typed A to abort parsing.]

[The parse continues]

Illegal lexeme P (of type P) in  [The next bug is discovered by
    I 20 U $                      the parser]
    ITEM ZZ I 20 S $
    ITEM II I 20 P! $
    BEGIN
    ZZ = 0 $
    II = 0 $
A lexeme of one of the following types was expected:
    S U
What should be done? Replace with: "
                                 [User opts for Replace option:
                                  the parse continues after fix]

Illegal lexeme END (of type END) in
                                 [Next error is found]
    $
    ZZ = ZZ+AA $
    GOTO LL
    END!                         [The ! indicates point of fault]
    TERM $
A lexeme of one of the following types was expected:
    ) $ , LPARVDOLL              [The diagnostics tell user that
                                  a $ was omitted]
What should be done? Replace with: "$ END"
                                 [User types R and replacement
                                  string]

[At  this  point  the  parse  concludes  successfully,  with  printout  of  the 
transduction.  We  have  abridged  it  here,  in  order  to  save  space  in  favor 
of  the  indented  printout  via  PRINTDEF  which  follows]. 


(SUBPROGRAM TPROG ((AA BB) (OUT:PARS ZZ)) (ASSERTIN (GTQ BB (
DECIMALMNTEGER "0"))) (ASSERTOUT (EQ ZZ (TIMES AA BB))) (ITEM AA ...

45_(PRINTDEF (VALUEOF 44))       [This line typed by user]

(SUBPROGRAM TPROG ((AA BB)
                   (OUT:PARS ZZ))
  (ASSERTIN (GTQ BB (DECIMALMNTEGER "0")))
  (ASSERTOUT (EQ ZZ (TIMES AA BB)))
  (ITEM AA (I (DECIMALMNTEGER "20")
              S NIL NIL NIL)
        NIL NIL)
  (ITEM BB (I (DECIMALMNTEGER "20")
              U NIL NIL NIL)
        NIL NIL)
  (ITEM ZZ (I (DECIMALMNTEGER "20")
              S NIL NIL NIL)
        NIL NIL)
  (ITEM II (I (DECIMALMNTEGER "20")
              U NIL NIL NIL)
        NIL NIL)
  (BEGIN (:= ZZ (DECIMALMNTEGER "0"))
         (:= II (DECIMALMNTEGER "0"))
         (LABEL LL)
         (ASSERT (EQ ZZ (TIMES AA II)))
         (IF (NEQ II BB)
             (:= ZZ (PLUS ZZ AA)))
         (GOTO LL NIL)))NIL      [The final "NIL" here is
                                  an artifact of PRINTDEF
                                  and is not part of the
                                  transduction]

[Next the user writes the transduction out to a new TENEX file by using
the INTERLISP function WRITEFILE[expr; filename].]

46_(WRITEFILE (LIST (VALUEOF 44)) 'TP.CORRECT/PARSE]

<ELSPAS>TP.CORRECT/PARSE;1

[END OF DIALOG]

[We  display  that  whole  file  below]: 

(PRIN1 (QUOTE "
WRITEFILE OF <ELSPAS>TP.CORRECT/PARSE;1 MADE BY ELSPAS ON
27-JUL-76 16:25:15
") T)
(SUBPROGRAM TPROG ((AA BB)
                   (OUT:PARS ZZ))
  (ASSERTIN (GTQ BB (DECIMALMNTEGER "0")))
  (ASSERTOUT (EQ ZZ (TIMES AA BB)))
  (ITEM AA (I (DECIMALMNTEGER "20")
              S NIL NIL NIL)
        NIL NIL)
  (ITEM BB (I (DECIMALMNTEGER "20")
              U NIL NIL NIL)
        NIL NIL)
  (ITEM ZZ (I (DECIMALMNTEGER "20")
              S NIL NIL NIL)
        NIL NIL)
  (ITEM II (I (DECIMALMNTEGER "20")
              U NIL NIL NIL)
        NIL NIL)
  (BEGIN (:= ZZ (DECIMALMNTEGER "0"))
         (:= II (DECIMALMNTEGER "0"))
         (LABEL LL)
         (ASSERT (EQ ZZ (TIMES AA II)))
         (IF (NEQ II BB)
             (:= ZZ (PLUS ZZ AA)))
         (GOTO LL NIL)))
STOP                             [STOP is the EOF marker]


Now, as the reader may have noticed, the above program contains
serious semantic errors in addition to the syntax errors already repaired.
Of course, the parser/transducer is incapable of discovering semantic errors
since these are connected with the programmer's intentions in writing the
program. These intentions are, however, embodied in the input, output,
and inductive assertions embedded in the program. In particular, the
relation ZZ = AA*II is supposed to be maintained as control passes around
the loop. That is what is meant by calling that relation a 'loop invariant'.
Since ZZ is altered while II and AA are unchanged around the loop,
the above relation is not maintained invariant. It happens that this
inductive assertion is true on first entry to the labelled point LL, but
the relation would be destroyed after the assignment ZZ := ZZ+AA, unless
AA happened to be 0.

There is another semantic error. The program will never terminate
since control always passes back to the label LL, regardless of whether
II EQ BB or not. The programmer intended, of course, to use II NQ BB
as a test for loop exit, as well as for determining whether II and ZZ are
to be incremented. The control structure of the IF statement and the
GOTO is quite incorrect with respect to the programmer's intentions.
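
Both semantic errors can be observed by simulating the loop. The
following is a rough Python transliteration of the syntactically
repaired (but still semantically faulty) program; the function name
run_faulty and the max_passes cap are assumptions introduced here,
the cap being needed only because the real loop never exits.

```python
def run_faulty(aa, bb, max_passes=10):
    """Simulate the faulty loop; record whether ZZ == AA*II holds at LL."""
    zz, ii = 0, 0
    holds_at_ll = []
    for _ in range(max_passes):          # artificial cap on loop passes
        holds_at_ll.append(zz == aa * ii)  # LL. ASSERT ZZ EQ AA*II
        if ii != bb:                     # IF II NQ BB
            zz = zz + aa                 # ZZ = ZZ + AA (II is never changed)
        # GOTO LL is unconditional, so there is no exit from the loop
    return holds_at_ll
```

With AA = 3 and BB = 2 the assertion holds on first entry to LL and
fails on every later pass; with AA = 0 it holds on every pass, as the
text predicts. In every case the loop runs until the artificial cap is
reached.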

In the next Appendix we show how these semantic errors can be revealed
by subsequent stages in the verification process, in particular by
generating verification conditions for the (transduced) program.

Detection and Correction of Semantic Errors Through
Verification Conditions

Here we illustrate how the generation and inspection of verification
conditions (VCs) aid in semantic debugging of a JOCIT program. It should
be emphasized that the system we are using here is in a rather incomplete
state. As a result the detailed form of the dialog illustrated is quite
different (in particular regarding user-interaction features) from either
the existing RPE/1 verifier or the RPE/2 verifier when completed. The RPE/1
system contained an interactive deductive system (not used here) that is
capable of proving the mathematical correctness of the appropriate VCs.

[A mechanical proof of correctness of VCs in a slightly different form that
was carried out by the RPE/1 deductive system is shown in Appendix D.]

Here we content ourselves with human inspection of the VCs to discover
that they are invalid; then we repair the program and regenerate VCs that
are provable manually.

We redisplay the transduced program (containing semantic errors
as mentioned earlier):

36_PP TPROG

(SUBPROGRAM TPROG ((AA BB)
                   (OUT:PARS ZZ))
  (ASSERTIN (GTQ BB (DECIMALSINTEGER "0")))
  (ASSERTOUT (EQ ZZ (TIMES AA BB)))
  (ITEM AA (I (DECIMALSINTEGER "20")
              S NIL NIL NIL)
        NIL NIL)
  (ITEM BB (I (DECIMALSINTEGER "20")
              U NIL NIL NIL)
        NIL NIL)
  (ITEM ZZ (I (DECIMALSINTEGER "20")
              S NIL NIL NIL)
        NIL NIL)
  (ITEM II (I (DECIMALSINTEGER "20")
              U NIL NIL NIL)
        NIL NIL)
  (BEGIN (:= ZZ (DECIMALSINTEGER "0"))
         (:= II (DECIMALSINTEGER "0"))
         (LABEL LL)
         (ASSERT (EQ ZZ (TIMES AA II)))
         (IF (NEQ II BB)
             (:= ZZ (PLUS ZZ AA)))
         (GOTO LL NIL)))

TPROG


[Next, the user invokes the verification condition generator
on TPROG, using a function appropriate to subprograms.]

37_(VCG:SUBPROGRAM TPROG)

TPROG

[This function returns the subprogram's name when VCG has finished,
but it also sets a new variable, TPROG#, to the list of VCs. Then the user
asks for an indented display of these VCs.]

38_PP TPROG#

[[IMPLIES (AND (GTQ BB (DECIMALMNTEGER "0"))
               (ITEM AA (I (DECIMALMNTEGER "20")
                           S NIL NIL NIL)
                     NIL NIL)
               (ITEM BB (I (DECIMALMNTEGER "20")
                           U NIL NIL NIL)
                     NIL NIL)
               (ITEM ZZ (I (DECIMALMNTEGER "20")
                           S NIL NIL NIL)
                     NIL NIL)
               (ITEM II (I (DECIMALMNTEGER "20")
                           U NIL NIL NIL)
                     NIL NIL))
          (EQ (DECIMALMNTEGER "0")
              (TIMES AA (DECIMALMNTEGER "0"]
 (IMPLIES (EQ ZZ (TIMES AA II))
          (IF (NEQ II BB)
              (EQ (PLUS ZZ AA)
                  (TIMES AA II))
              (EQ ZZ (TIMES AA II]

TPROG#

[There are two VCs, the first expressing the validity of the
inductive assertion when it is first encountered (essentially that
0 = AA*0) given the various hypotheses. The hypotheses consist of
the collected item declarations together with the entry assertion.
In this case the hypotheses are irrelevant to the conclusion, though
that would usually not be the case. We anticipate that in future
versions of the verifier the item declarations will be processed by
the VCG into a more mathematical assertion language. The second VC
expresses (a) the invariance of the inductive relation around the loop
and (b) the implication relating that inductive assertion to the exit
assertion on loop exit. The former (a) is mathematically false, while
the latter (b), although true, is misleading since the loop cannot exit.
This is a fundamental deficiency of partial correctness proofs. We would
have to supply a more complex inductive assertion here to (attempt to)
prove termination, and then discover the termination error. This could be
done here by using II as a loop counter, and adding a clause (LTQ II BB)
to the inductive assertion. The proof that this clause is invariant
would effectively bound the number of loop executions above by BB, thus
demonstrating termination. Note however that the failure to increment
II would first have to be repaired before the looping error could be
detected.]

[The user next edits the transduced program. This is not a recommended
procedure: he should edit the original JOCIT program and then reparse it.
The shortcut was used here to save space.]


39_EDIT (TPROG)                  [User invokes Lisp editor on TPROG]

edit

25*F IF PP                       [Search for the IF statement]

(IF (NEQ II BB)
    (:= ZZ (PLUS ZZ AA)))

26*3 MBD BEGIN]                  [Embeds the assignment in a BEGIN,]

PP

(BEGIN (:= ZZ (PLUS ZZ AA)))

27*(N (:= II (PLUS II (DECIMALNINTEGER "1")))(GOTO LL NIL))

                                 [and appends the missing statements]

0 PP                             [Here user asks to see the edited
                                  IF clause]
(IF (NEQ II BB)
    (BEGIN (:= ZZ (PLUS ZZ AA))
           (:= II (PLUS II (DECIMALNINTEGER "1")))
           (GOTO LL NIL)))

28*0 PP                          [User instructs editor to go up one
                                  level and print out the BEGIN form]
(BEGIN (:= ZZ (DECIMALNINTEGER "0"))
       (:= II (DECIMALNINTEGER "0"))
       (LABEL LL)
       (ASSERT (EQ ZZ (TIMES AA II)))
       (IF (NEQ II BB)
           (BEGIN (:= ZZ (PLUS ZZ AA))
                  (:= II (PLUS II (DECIMALNINTEGER "1")))
                  (GOTO LL NIL)))
       (GOTO LL NIL))

29*(7)                           [The extraneous GOTO is deleted]

P

(BEGIN (:= ZZ &) (:= II &) (LABEL LL) (ASSERT &) (IF & &))

30*OK

TPROG

40_REDO VCG:SUBPROGRAM           [The operation VCG is redone on the
                                  corrected transduced program]
TPROG

41_PP TPROG#                     [User asks to see all the VCs]

[[IMPLIES (AND (GTQ BB (DECIMALNINTEGER "0"))
               (ITEM AA (I (DECIMALNINTEGER "20")
                           S NIL NIL NIL)
                     NIL NIL)
               (ITEM BB (I (DECIMALNINTEGER "20")
                           U NIL NIL NIL)
                     NIL NIL)
               (ITEM ZZ (I (DECIMALNINTEGER "20")
                           S NIL NIL NIL)
                     NIL NIL)
               (ITEM II (I (DECIMALNINTEGER "20")
                           U NIL NIL NIL)
                     NIL NIL))
          (EQ (DECIMALNINTEGER "0")
              (TIMES AA (DECIMALNINTEGER "0"]
 (IMPLIES (EQ ZZ (TIMES AA II))
          (IF (NEQ II BB)
              (EQ (PLUS ZZ AA)
                  (TIMES AA (PLUS II (DECIMALNINTEGER "1"))))
              (EQ ZZ (TIMES AA BB]

TPROG#

This concludes the dialog. The VCs can now be seen by inspection
to be correct. Except for the appearance of transduced numeric constant
expressions like (DECIMALMNTEGER "0"), the RPE/1 deductive system could
easily handle the proof of these VCs. The RPE/2 system, when it has been
completed, will be able to deal with such expressions, treating them in
accordance with JOVIAL semantics for the appropriate data types: integer,
floating-point, and fixed-point numbers.
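
The "inspection" of the second VC, before and after the repair, can be
mimicked by a bounded search for counterexamples. This is only a sketch
under stated assumptions: the function names and the search bound are
invented for illustration, the numeric constants are treated as ordinary
integers, and the item declarations are ignored. A bounded search can
refute a VC but cannot prove it; it is not a substitute for the deductive
system.

```python
from itertools import product

def vc2_faulty(aa, bb, zz, ii):
    """Second VC generated from the faulty program."""
    hypothesis = zz == aa * ii
    conclusion = (zz + aa == aa * ii) if ii != bb else (zz == aa * ii)
    return (not hypothesis) or conclusion

def vc2_repaired(aa, bb, zz, ii):
    """Second VC generated from the repaired program."""
    hypothesis = zz == aa * ii
    conclusion = (zz + aa == aa * (ii + 1)) if ii != bb else (zz == aa * bb)
    return (not hypothesis) or conclusion

def find_counterexample(vc, bound=4):
    """Try all integer values in [-bound, bound]; return a falsifying one."""
    values = range(-bound, bound + 1)
    for aa, bb, zz, ii in product(values, repeat=4):
        if not vc(aa, bb, zz, ii):
            return {"AA": aa, "BB": bb, "ZZ": zz, "II": ii}
    return None
```

Any counterexample found for the faulty VC has AA different from 0,
matching the earlier observation that the invariant survives only when
AA is 0; no counterexample is found for the repaired VC within the bound.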
A HYPOTHETICAL VERIFICATION SCENARIO

We present here an imaginary dialog between a user and a semiautomatic
verification system. This represents a possible form that an interactive
program-writing, debugging, and verification system might take. Many
of the imagined features already exist (some in more primitive forms),
others exist only on paper, and still others are some way off from
realization.

For clarity in presentation we display user type-ins in upper-case
characters, and most system responses in lower case (or mixed case).
Text enclosed in square brackets has been inserted to help the reader
understand what is supposed to be happening. The system uses _, +, and *
as prompt characters. The user starts things off by calling the program
verifier from the system executive (prompt character #).

[Beginning of dialog]

#VERIFIER

Verifier  ready.  Type  ? for  options. 

_? 

Options: 

Parser:  type  P 

Verification  Condition  Generator:  type  VCG 
Deductive  System:  type  DS 

_P 

<loading JOCIT.PARSER>

Type F if you wish to parse an existing JOCIT file, or type I(nput) if
you wish to write a new program.

_F                               [the user opts for an existing file]

Type name of your JOCIT program file: ARRAYSCANNER

<loading USER.ARRAYSCANNER>

<parsing ARRAYSCANNER>

Illegal lexeme found: END
at:
    BEGIN XX=AA+BB $ ZZ=AA-BB END!    [This upper-case text is printed
    TERM $                             by the system to provide
                                       diagnostic help]

Parser was expecting one of: $ * + - (

What should be done? (type ? for options)

_?

Options:
    R(eplace)
    T(eco) for TECO text editor
    TV for Display-oriented editor
    A(bort)

_R

Type string to replace END (in double quotes): "$ END"

Parse  successfully  completed.  Type  filename  if  you  wish  to  save  this 
parse,  else  type  <CR>. 


_ARRAYSCAN.PARSE                 [User asks for the parsed and
                                  transduced program to be saved
                                  under the name he types in]


File  has  been  made.  Type  ? for  options. 

_? 

Options: 

Type  V for  VCG 

Type  P for  a new  parse 

Type Q(uit) to exit

_V 

<loading  VCG> 

VCG is ready. Type ? for options.     [The new ready symbol + is
                                       indicative of the VCG]
+?

Type L(ast) if you want to generate VCs for last parse;
type F(ile) if you want to VCG another existing file.

+L

<generating VCs for ARRAYSCAN.PARSE>

VCs completed. Type filename or <CR>: [the user types a carriage return
                                       to indicate he doesn't wish to
                                       save these VCs now.]

Do you wish to invoke Deductive System? If so, type DS; else ? for other
options.

+DS

<loading Deductive.System>

Ready. Type L(ast) for last-generated VCs, or ? for other options.

*L                               [* is the ready symbol of the DS]

<Trying to prove VCs for ARRAYSCAN.PARSE>

Unable to prove VC#3 (for-loop): transduced code starts:

(FOR I (0 2 100) (ASSERT T) ...)

Do  you  wish  to  modify  the  inductive  assertion? 

*Y

Options:

Type E(dit) to invoke LISP editor
Type A(bort) to abort proof of this clause, and continue proof
Type Q(uit) to quit proof entirely

*E

Edit

**PP

(FOR I (0 2 100)
     (ASSERT T)
     (BEGIN (:= ZZ (PLUS ZZ I))
            (IF (GT ZZ MAXZZ)
                (:= ZZ MAXZZ))))

**(R T (AND (LTQ ZZ MAXZZ) (LTQ 0 I) (LTQ I 100)))

**P

(FOR I (0 2 100) (ASSERT (AND & & &)) (BEGIN & &))

**OK

<Continuing proof of VCs for ARRAYSCAN>

Found embedded subprocedure "PROCESSELEMENTS" inside ARRAYSCAN.

Do you wish to verify subprocedure now? (type N for Now) or
(type D for Defer).

*D

Verification of subprocedure PROCESSELEMENTS has been deferred.

<Continuing proof of VCs for ARRAYSCAN>

Proof of VCs for ARRAYSCAN completed
Subject to correctness of assertions for PROCESSELEMENTS.

Do you wish to verify PROCESSELEMENTS now?

*Y

<Generating VCs for PROCESSELEMENTS>

VCG of PROCESSELEMENTS completed.

<Verifying VCs for PROCESSELEMENTS>

Proof of VCs for ARRAYSCAN completed
No outside assumptions used in proof.

Do you wish to save the proof trace?

*Y

Type name of file: (<CR> defaults to progname.PROOFTRACE)

[User types carriage return]

Saved on PROCESSELEMENTS.PROOFTRACE
Other verifications?

*N

<VERIFIER exited>

PROOF OF VCS BY THE RPE/1 GOAL-DRIVEN DEDUCTIVE SYSTEM

This shows a proof trace for validity of the three VCs
exhibited directly below, as the value of TPROG#. The deductive
system used here is a goal-driven system that was part of the
RPE/1 verifier.

184_PP TPROG#

[(IMPLIES (GTQ BB 0)
          (EQ 0 (TIMES AA 0)))
 [IMPLIES (EQ ZZ (TIMES AA II))
          (IMPLIES (NEQ II BB)
                   (EQ (PLUS ZZ AA)
                       (TIMES AA (PLUS II 1]
 (IMPLIES (EQ ZZ (TIMES AA II))
          (IMPLIES (NOT (NEQ II BB))
                   (EQ ZZ (TIMES AA BB]

TPROG#

185_(FOR CLAUSES IN TPROG# (PROVE CLAUSES))   [User calls for proof of
                                               the 3 VCs]
Trying-to-prove:                              [The first VC]
(IMPLIES (GTQ BB 0)
         (EQ 0 (TIMES AA 0)))
Equality-true-by-simplification
IMPLIES-proved-TRUE-by-simplification
Proof-is-successful!

Trying-to-prove:                              [The second VC]
[IMPLIES (EQ ZZ (TIMES AA II))
         (IMPLIES (NEQ II BB)
                  (EQ (PLUS ZZ AA)
                      (TIMES AA (PLUS II 1]
To-prove:
[IMPLIES (EQ ZZ (TIMES AA II))
         (OR (EQ BB II)
             (EQ (PLUS AA ZZ)
                 (TIMES AA (PLUS 1 II]
we-try-to-assert:
(EQ ZZ (TIMES AA II))
and-then-try-to-prove:
[OR (EQ BB II)
    (EQ (PLUS AA ZZ)
        (TIMES AA (PLUS 1 II]
(Asserting: (EQ ZZ (TIMES AA II)))
CONSISTENCY-CHECK-WAS-BYPASSED
Consistent
Trying-to-prove:
[OR (EQ BB II)
    (EQ (PLUS AA ZZ)
        (TIMES AA (PLUS 1 II]
To-prove:
[IMPLIES (NEQ BB II)
         (EQ (PLUS AA ZZ)
             (TIMES AA (PLUS 1 II]
we-try-to-assert:
(NEQ BB II)
and-then-try-to-prove:
(EQ (PLUS AA ZZ)
    (TIMES AA (PLUS 1 II)))
(Asserting: (NEQ BB II))
CONSISTENCY-CHECK-WAS-BYPASSED
Consistent
Trying-to-prove:
(EQ (PLUS AA ZZ)
    (TIMES AA (PLUS 1 II)))
Checking-if:
(EQ (PLUS AA ZZ)
    (TIMES AA (PLUS 1 II)))
was-asserted-at-the-context:
(11 10)
or-at-some-parent-context
It-was-not-previously-asserted
Invoking-fast-transitivity-routine
(Want to simplify: (EQ (PLUS AA ZZ) (TIMES AA (PLUS 1 II))) ?)
YES
ok
Applying-SIMPLIFY-to:
(EQ (PLUS AA ZZ)
    (TIMES AA (PLUS 1 II)))
Simplified-to:
(EQ (PLUS AA ZZ)
    (PLUS AA (TIMES AA II)))                  [The simplifier uses
Checking-if:                                   the distributive law]
(EQ (PLUS AA ZZ)
    (PLUS AA (TIMES AA II)))
was-asserted-at-the-context:
(11 10)
or-at-some-parent-context
It-was-not-previously-asserted
(Trying-to-verify-an-EQUALITY-relation (EQ (PLUS AA ZZ) (PLUS AA (TIMES
AA II))))
(Trying-to-prove-relation: (EQ (PLUS AA ZZ) (PLUS AA (TIMES AA II)))
Want-to-subst-for-some-subexpr-of: (PLUS AA ZZ) ?)
YES
ok
(type-in: :SUBST:L<subexpr-to-be-replaced> broken)
186:SUBST:L(ZZ)
(1 POSSIBLE-SUBSTITUTIONS)
(substitute (TIMES AA II) for ZZ in ((PLUS AA ZZ)) ?)
YES
ok
(PLUS AA (TIMES AA II))
187:GO
type-in: :SUBST:L<subexpr-to-be-replaced> finished
(For-some-subexpr-of: (PLUS AA (TIMES AA II)) ?)
NO
refused-by-user
Equality-true-by-simplification
Proof-is-successful!
Proof-is-successful!
Proof-is-successful!

[This completes proof of VC2]
Trying-to-prove:                              [The third VC]
[IMPLIES (EQ ZZ (TIMES AA II))
         (IMPLIES (NOT (NEQ II BB))
                  (EQ ZZ (TIMES AA BB]
To-prove:
[IMPLIES (EQ ZZ (TIMES AA II))
         (OR (NEQ BB II)
             (EQ ZZ (TIMES AA BB]
we-try-to-assert:
(EQ ZZ (TIMES AA II))
and-then-try-to-prove:
(OR (NEQ BB II)
    (EQ ZZ (TIMES AA BB)))
(Asserting: (EQ ZZ (TIMES AA II)))
CONSISTENCY-CHECK-WAS-BYPASSED
Consistent
Trying-to-prove:
(OR (NEQ BB II)
    (EQ ZZ (TIMES AA BB)))
Checking-if:
(OR (NEQ BB II)
    (EQ ZZ (TIMES AA BB)))
was-asserted-at-the-context:
(12)
or-at-some-parent-context
It-was-not-previously-asserted
To-prove:
(IMPLIES (EQ BB II)
         (EQ ZZ (TIMES AA BB)))
we-try-to-assert:
(EQ BB II)
and-then-try-to-prove:
(EQ ZZ (TIMES AA BB))
(Asserting: (EQ BB II))
CONSISTENCY-CHECK-WAS-BYPASSED
Consistent
Trying-to-prove:
(EQ ZZ (TIMES AA BB))
Checking-if:
(EQ ZZ (TIMES AA BB))
was-asserted-at-the-context:
(13 12)
or-at-some-parent-context
It-was-not-previously-asserted
Invoking-fast-transitivity-routine
(Want to simplify: (EQ ZZ (TIMES AA BB)) ?)
NO
refused-by-user
(Trying-to-verify-an-EQUALITY-relation (EQ ZZ (TIMES AA BB)))
(Trying-to-prove-relation: (EQ ZZ (TIMES AA BB))
Want-to-subst-for-some-subexpr-of: ZZ ?)
NO                               [The user could have said YES,
refused-by-user                   and asked for a substn on ZZ]
(For-some-subexpr-of: (TIMES AA BB) ?)
YES
ok
(type-in: :SUBST:R<subexpr> broken)
188:SUBST:R(BB)
(1 POSSIBLE-SUBSTITUTIONS)
(substitute II for BB in ((TIMES AA BB)) ?)
YES
ok
(TIMES AA II)
189:GO
type-in: :SUBST:R<subexpr> finished
Proof-is-successful!
Proof-is-successful!
Proof-is-successful!

[This completes the machine proof of the three VCs]

ABSTRACT 

This paper describes a program verifier whose basic unit of
verification is a module consisting of data and procedures. The module hides
the data by allowing the data to be changed and accessed only by calling
the procedures. In specifying a module to the verifier, the data is
thought of abstractly in terms of information content rather than
representation for efficient storage and retrieval. The abstract data is
represented by a set of state variables. A notation based on pure LISP
is used to write the entry and exit assertions for the procedures of the
module. The state variables can contain functions as well as other forms
of data. The language of the verifier does not provide for explicitly
defining the type of the data in the state variables. Instead, the type
of the data is determined by the use of these variables in the entry and
exit assertions. The actual representation of data is described by
declaration in the procedural language used for implementation. The
notation based on pure LISP is further used to specify the mapping from
the abstract data into the actual data.

The method used by the verifier is based on Floyd's ideas of program
semantics combined with the approach which Boyer and Moore used to prove
theorems about LISP programs.

The verifier described is currently being implemented at System Development
Corporation in our proprietary system CWIC (Compiler for Writing and
Implementing Compilers). It is running and has verified some procedures, but
it does not have all the features described.

The paper first describes the verifier from the standpoint of a user by
giving a line by line description of a sample module. It then presents
an overall discussion of the implementation.

1. INTRODUCTION

The concept of a module, as used in this paper, is that of a programming
unit which consists of data and procedures. It hides the data by allowing
it to be changed and accessed only by calling its procedures. This is
similar to the concept of a module, as expressed by Parnas[19], and to the
concept of a cluster, as described by Liskov and Zilles[15].

The  basic  ideas  which  are  used  in  verification  today  came  from  a paper  on 
the  semantics  of  programming  linguages  by  Floyd[6].  The  current  trend  of 
applying  these  ideas  began  with  a software  tool  called  a proqram  verifier 
by  King[14].  This  was  soon  followed  by  a number  of  other  verifiers  which 
are  described  in  the  following  papers:  Good[9],  Elspas,  Green,  Levitt, 

Waldinger[5] , Deutsch[4],  Schorre[22],  Good,  London,  Bledsoe[ll],  and 
Henke,  Luckham[12]. 

In  working  out  examples  for  such  verifiers,  the  author  has  found  it  awkward 
to  express  exit  assertions  in  terms  of  the  arrays  used  by  the  procedures. 

In  describing  a dictionary  module,  it  is  natural  to  say  that  the  procedure 
PUT  puts  information  into  the  dictionary  and  that  the  procecure  GET  gets 
it  out.  It  is  not  relevant  whether  the  dictionary  is  hash  coded  or  sorted 
in  ascending  order  for  a binary  search.  Nevertheless,  in  describing  the 
PUT  routine  to  any  of  the  aforementioned  verifiers  it  is  necessary  to 
describe  its  effect  on  the  actual  arrays  used  in  the  implemention.  The 
result  is  that  the  retrieval  algorithm  used  by  the  dictionary  module 
permeates  the  exit  assertion  for  PUT. 
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A similar  desire  for  abstract  specification  has  been  expressed  by  Good[10]. 
He  has  shown  examples  of  procedures  with  abstract  snecification,  but  not 
In  a form  for  automatic  processing  by  a verifier. 

An  abstract  specification  of  the  dictionary  module  will  now  be  given, 
followed  by  an  implementation  that  uses  the  linear  search  algorithm. 

This  specification  and  implementation  are  intended  as  input  to  the  verifier 
being  implemented  at  System  Development  Corporation  (SDC),  although  some 
changes may be required before they are actually processed. Implementations
using other algorithms are possible, but will not be given in this paper.
The  verifier  is  explained  by  detailed  consideration  of  the  dictionary 
module  example.  The  input  to  the  verifier  is  given  in  Figures  1 through  3. 
Figure  1 is  the  specification  of  the  module  in  terms  of  abstract  data. 
Figure 2 shows the declarations for the concrete data used in the linear
search implementation, and a mapping from the abstract data onto the
concrete data. Figure 3 shows the routines of this implementation written
in the procedural language used by the verifier. These three figures are
explained in Section 2 following.

.MODULE
.SPECIFICATION
.STATE DICTIONARY, STACK;
.INITIAL DICTIONARY = .LAMBDA X, Y(0) & STACK = .NIL;

.PROCEDURE PUSH():
   .ENTRY .TRUE;
   .EXIT DICTIONARY'=DICTIONARY & STACK'=DICTIONARY " STACK;

.PROCEDURE POP():
   .ENTRY STACK ≠ .NIL;
   .EXIT DICTIONARY'=STACK.1 & STACK'=STACK:2;

.PROCEDURE PUT(F, X, Y):
   .ENTRY .TRUE;
   .EXIT STACK' = STACK &
      DICTIONARY' = .LAMBDA I, J(
         .IF I=F & J=X
         .THEN Y
         .ELSE DICTIONARY(I, J) .END);

.PROCEDURE GET(F, X; Y):
   .ENTRY .TRUE;
   .EXIT STACK'=STACK & DICTIONARY'=DICTIONARY &
      Y'=DICTIONARY(F, X);


V. Schorre
Figure 1.

.IMPLEMENTATION
.DECLARE DICT_F[1000], DICT_X[1000], DICT_Y[1000], STACKX[100],
   RUN, LEVEL;

.DEFINE
SEARCH(F, X, N) =
   .IF N < 0
   .THEN 0
   .OR IF F=DICT_F[N] & X=DICT_X[N]
   .THEN DICT_Y[N]
   .OR IF .TRUE
   .THEN SEARCH(F, X, N-1) .END;

TAIL(W) =
   .IF W=0
   .THEN .NIL
   .ELSE .LAMBDA F, X(
      SEARCH(F, X, STACKX[W]-1)) "
      TAIL(W-1) .END;

.MAP
DICTIONARY = .LAMBDA F, X(SEARCH(F, X, RUN-1));
STACK = TAIL(LEVEL);


V. Schorre
Figure 2.

.ASSERT 0≤RUN<1000 & 0≤LEVEL<100 &
   ∀ I(0≤I≤LEVEL ⊃ 0≤STACKX[I]≤1000);

.INITIAL
   RUN := 0; LEVEL := 0; STACKX[0] := 0;
.RETURN

.PROCEDURE PUSH:
   .ERROR UNLESS LEVEL < 99;
   LEVEL := LEVEL + 1; STACKX[LEVEL] := RUN;
.RETURN

.PROCEDURE POP:
   RUN := STACKX[LEVEL]; LEVEL := LEVEL - 1;
.RETURN

.PROCEDURE PUT:
   .DECLARE H;
   .ERROR UNLESS RUN < 999;
   H := STACKX[LEVEL];
   .LOOP ∀ I(STACKX[LEVEL]≤I<H' ⊃ F≠DICT_F[I] | X≠DICT_X[I]) &
         STACKX[LEVEL]≤H'
   .WHILE H<RUN+1 & (F≠DICT_F[H] & X≠DICT_X[H]) .DO
      H := H + 1; .END
   .IF H = RUN+1
   .THEN DICT_Y[H] := Y;
   .ELSE DICT_F[RUN] := F; DICT_X[RUN] := X;
         DICT_Y[RUN] := Y; RUN := RUN+1; .END
.RETURN

.PROCEDURE GET:
   .DECLARE H;
   H := RUN - 1;
   .LOOP ∀ I(H'<I≤RUN-1 ⊃ F≠DICT_F[I] | X≠DICT_X[I]) &
         H'≤RUN-1
   .WHILE 0<H & (F≠DICT_F[H] | X≠DICT_X[H]) .DO
      H := H-1; .END
   .IF F=DICT_F[H] & X=DICT_X[H]
   .THEN Y := DICT_Y[H];
   .ELSE Y := 0; .END
.RETURN
.FINISH


V. Schorre
Figure 3.

2. EXAMPLE OF THE DICTIONARY MODULE

2.1  INFORMAL  DESCRIPTION  OF  THE  PROBLEM 

The  dictionary  of  the  example  used  in  this  paper  is  like  a function  whose 
values  are  specified  one  at  a time  by  the  user.  When  the  user  wishes  to 
assign  the  value  Y to  the  function  F(X),  he  calls  the  procedure  PUT(F,  X,  Y). 
When  he  wishes  to  get  the  value  of  F(X),  which  he  has  previously  assigned 
by using the PUT procedure, and store it in Y, he calls the procedure
GET(F, X; Y). There are two additional procedures, PUSH and POP. PUSH
remembers  the  current  state  of  the  dictionary,  but  does  not  alter  it  in 
any  way;  POP  restores  the  state  of  the  dictionary  to  the  state  in 
which  it  was  at  the  corresponding  PUSH.  If  more  than  one  value  has  been 
assigned  to  F(X),  GET  returns  the  last  such  value  that  has  not  been  removed 
by  POP.  If  no  value  is  currently  entered  for  F(X),  GET  returns  zero. 

2.2 SPECIFICATION OF THE DICTIONARY MODULE IN TERMS OF ABSTRACT DATA

Figure 1 gives the dictionary module specification. No attempt is made to
give a complete definition of the language in which this example is
written; rather, this example specification will be discussed line by line.

.MODULE

.SPECIFICATION 

The  key  word  .MODULE  signals  the  verifier  to  begin  processing  a new  module. 
This  key  word  is  always  followed  by  the  key  word  .SPECIFICATION,  because 
the first information that the verifier needs about a module is its
specification.

.STATE DICTIONARY, STACK;

This  module  has  two  state  variables,  DICTIONARY  and  STACK.  The  variable 
DICTIONARY  contains  a function  with  two  arguments.  Because  DICTIONARY 
is  a state  variable,  it  may  contain  a different  function  after  execution 
of  a procedure  than  it  did  before  execution.  Consider  the  procedure  call 
PUT(F,  X,  Y).  After  its  execution  the  value  of  the  function  in  the 
variable  DICTIONARY  for  the  arguments  F and  X is  Y.  This  can  be  expressed 
by the notation DICTIONARY(F, X) = Y. F may be thought of as an attribute,
X as an object with that attribute, and Y as the value of that
attribute. The variable STACK always contains a list of functions, each
having  two  arguments. 

Type  declarations,  like  those  in  the  language  PASCAL  described  by  Wirth[27], 
would  be  a useful  feature  to  add  to  the  verifier.  Using  them,  the  information 
given in the preceding paragraph could be expressed formally to the
verifier.  This  feature  has  not  been  included  in  the  present  version  of  the 
verifier  because  it  is  not  needed  to  accomplish  the  objective  of  making  the 
specification  independent  of  the  implementation. 

.INITIAL  DICTIONARY  = .LAMBDA  X,  Y(0)  & STACK  = .NIL; 

Each  module  contains  an  initialization  procedure  which  will  be  executed 
after  the  entire  program  has  been  loaded  and  before  the  main  program  is 
executed.  The  purpose  of  this  procedure  is  to  initialize  the  data  protected 
by  the  module.  The  expression  above  will  be  true  immediately  after  the 
initial  procedure  for  the  module  has  been  executed.  The  value  of  the  state 
variable  DICTIONARY  is  a function  of  two  arguments  which  always  returns  0. 

The  value  of  the  state  variable  STACK  is  the  empty  list,  which  in  the 
language of the verifier is denoted by .NIL.

.PROCEDURE PUSH():

.ENTRY .TRUE;

.EXIT DICTIONARY'=DICTIONARY & STACK'=DICTIONARY " STACK;

This  is  the  specification  of  the  procedure  PUSH.  The  dictionary  being 
described  has  a push-down  stack  capability,  which  would  be  useful  in 
compiling  a block-structured  language  such  as  ALGOL  60.  Each  procedure 
is  described  by  an  entry  condition  and  an  exit  condition.  When  the 
verifier  processes  the  implementation  of  a procedure,  it  checks  to  see 
that  if  the  entry  condition  is  true  immediately  before  the  procedure 
is entered, then the exit condition will be true immediately after it is
exited.  When  processing  a call  to  a procedure,  the  verifier  checks  to  see 
that  the  entry  assertion  is  satisfied,  and  then  assumes  the  exit  assertion. 
Thus  the  entry  assertion  represents  some  restriction  on  the  arguments  of 
the  procedure,  as  well  as  restrictions  on  global  data  at  the  time  of  call. 
The  procedure  PUSH  has  no  restriction,  so  its  entry  assertion  is  .TRUE. 

The  exit  assertion  involves  the  comparison  of  the  state  variables  at  two 
points  in  time--before  entry  into  the  procedure  and  after  exit  from  the 
procedure.  The  state  variable  without  the  prime  refers  to  the  value  of 
that  variable  before  entry  to  the  procedure.  The  state  variable  with 
the prime refers to the value of that variable after exit from the
procedure. Thus DICTIONARY' = DICTIONARY means that the value of the state
variable DICTIONARY, after exit from the procedure, is the same as it was
before entry. The next part of the exit assertion is as follows:

STACK' = DICTIONARY " STACK

The double quote (") is used as the infix form of the LISP function CONS.
The  final  value  of  the  variable  STACK  is  the  list  obtained  by  putting  the 
original  value  of  the  dictionary  onto  the  front  of  the  list  which  was 
originally  in  STACK. 

.PROCEDURE POP():

.ENTRY STACK ≠ .NIL;

.EXIT DICTIONARY'=STACK.1 & STACK'=STACK:2;

Notice  that  the  procedure  POP  has  a restriction  on  entry.  It  would  not 
make  sense  to  pop  a stack  if  it  were  empty  to  begin  with.  The  exit 
assertion makes use of the post-fix operator .1, which is the LISP function
CAR, in other words, the first element of the list. The post-fix
operator :2 is the LISP function CDR, in other words, the remainder of
a list after the first element has been removed. The effect of the
procedure POP is to set the state variable DICTIONARY to the first element
of the list in STACK, and to set STACK to the rest of that list.

.PROCEDURE PUT(F, X, Y):

.ENTRY .TRUE;

.EXIT STACK' = STACK &
   DICTIONARY' = .LAMBDA I, J(
      .IF I=F & J=X
      .THEN Y
      .ELSE DICTIONARY(I, J) .END);

The procedure PUT has no restriction on entry. On exit, the value of the
state variable STACK is unchanged. The value of the state variable
DICTIONARY upon exit is specified by a lambda expression. A lambda
expression is a way of denoting a function. The variables I and J are
local to the lambda expression, and represent the two arguments of the
function being expressed. The value returned by this function is given
by a conditional expression. If the arguments of this function are the
same as the first two arguments of the procedure PUT, then the value of
this function will be the third argument of PUT; otherwise, the value
will be the same as for the function which was in the state variable
DICTIONARY on entry to PUT. This use of a lambda expression and a
conditional expression will seem natural to the reader who is already
familiar with LISP.

.PROCEDURE GET(F, X; Y):

.ENTRY .TRUE;

.EXIT STACK'=STACK & DICTIONARY'=DICTIONARY &
   Y'=DICTIONARY(F, X);

The  procedure  GET  has  two  formal  input  parameters,  F and  X,  and  a formal 
input/output  parameter  Y.  This  specification  does  not  completely  meet  the 
goal  of  being  implementation  independent.  The  language  in  which  the 
dictionary  module  is  to  be  implemented  allows  procedures  to  have  two  types 
of parameters--input parameters and input/output parameters. The input
parameters are written first and separated from the input/output parameters
by  a semi-colon  (;).  If  there  are  no  input/output  parameters,  then  the 
semi-colon  is  omitted,  as  in  the  case  of  the  procedures  already  discussed. 

A procedure  can  access  the  value  of  its  input  parameters,  but  cannot  change 
their value. A procedure can both access and change the value of its
input/output parameters.

The concept of module, as used in this paper, includes the notion that the
procedures  in  the  module  form  its  interface  with  the  outside  world.  In 
specifying  a module,  its  interface  is  also  specified.  This  interface  can 
be  language  dependent  and  to  a lesser  degree  implementation  dependent.  A 
module  which  has  been  specified  as  containing  certain  procedures  could  not 
be  implemented  in  a version  of  COBOL  which  does  not  recognize  the  concept 
of  procedure.  The  procedure  GET  could  not  be  implemented  according  to  the 
above specifications in a language which allows only input parameters,
although  the  specification  could  be  easily  modified  to  allow  implementation 
in  such  a language. 

According  to  the  exit  assertion,  the  state  variables  STACK  and  DICTIONARY 
are  unchanged  by  the  procedure  GET.  The  input/output  parameter  Y is  set 
to  the  value  obtained  by  applying  the  function  contained  in  the  state 
variable  DICTIONARY  to  the  two  input  parameters  F and  X. 
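
As an informal aid to reading Figure 1, the abstract specification can be
mimicked by a small executable model. The sketch below is written in Python,
which is not the language of the verifier; the class name AbstractDictionary
and its method names are assumptions made for illustration only. The state
is held exactly as the specification describes it: DICTIONARY is a function
of two arguments and STACK is a list of such functions.

```python
from typing import Callable, List

# A dictionary value is a function of two integer arguments.
Dict2 = Callable[[int, int], int]


class AbstractDictionary:
    """Illustrative model of Figure 1; names are hypothetical."""

    def __init__(self) -> None:
        # .INITIAL DICTIONARY = .LAMBDA X, Y(0) & STACK = .NIL
        self.dictionary: Dict2 = lambda i, j: 0
        self.stack: List[Dict2] = []

    def push(self) -> None:
        # .EXIT DICTIONARY' = DICTIONARY & STACK' = DICTIONARY " STACK
        self.stack = [self.dictionary] + self.stack

    def pop(self) -> None:
        # .ENTRY STACK /= .NIL (checked here at run time, unlike the verifier)
        assert self.stack, "entry assertion violated: STACK is empty"
        # .EXIT DICTIONARY' = STACK.1 & STACK' = STACK:2
        self.dictionary, self.stack = self.stack[0], self.stack[1:]

    def put(self, f: int, x: int, y: int) -> None:
        # DICTIONARY' = .LAMBDA I, J(.IF I=F & J=X .THEN Y
        #                            .ELSE DICTIONARY(I, J) .END)
        old = self.dictionary
        self.dictionary = lambda i, j: y if (i == f and j == x) else old(i, j)

    def get(self, f: int, x: int) -> int:
        # Y' = DICTIONARY(F, X); the result is returned instead of being
        # stored in an input/output parameter.
        return self.dictionary(f, x)
```

In this model a value entered after a PUSH disappears again at the matching
POP, and a pair that was never entered yields zero, as Section 2.1 requires.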

2.3 MAPPING FROM ABSTRACT DATA INTO CONCRETE

This section describes Figure 2 line by line.

.IMPLEMENTATION 

This  line  marks  the  beginning  of  that  part  of  the  module  which  is  truly 
implementation  dependent. 

.DECLARE DICT_F[1000], DICT_X[1000], DICT_Y[1000], STACKX[100], RUN, LEVEL;

This is a declaration in the procedural language in which the module is
being implemented. The only data which the language processes are integers
and one-dimensional arrays of integers. Thus, RUN and LEVEL are integer
variables. DICT_F, DICT_X, and DICT_Y are one-dimensional arrays of
integers, each 1000 elements long. STACKX is a one-dimensional array
of 100 integers. Array indexing begins with 0 and goes to one less
than the number of elements in the array. Figure 4 shows how information is
represented in terms of the concrete data of the module. The dictionary
is represented by the parallel arrays DICT_F, DICT_X, and DICT_Y. Items
are entered into these arrays one after another. RUN is a pointer to
the next available location in these arrays. STACKX is a stack of values
of RUN, which is used by PUSH and POP for saving and restoring the
dictionary. LEVEL is a pointer to the last item used in STACKX.

Figure  4 

An abstract dictionary is represented by the information in the parallel
arrays above some pointer. Thus the function stored in DICTIONARY is
represented by all the information in the parallel arrays above the
pointer RUN. The list of functions in STACK is represented by the
information in the parallel arrays above the corresponding pointers in STACKX.

Of course, STACKX is upside down with respect to STACK. The top element
in STACKX points to the top elements in the parallel arrays so that there
is no information above it. This pointer represents the function that
always returns 0 and which is at the bottom of the list in STACK.

If the reader refers back to Figure 2, he will see it divided into three
parts. The first part is the declarations discussed above. The second
part is a group of function definitions. The third part is the definition
of the state variables DICTIONARY and STACK in terms of the variables
which were declared in the first part. The functions defined in the
second part refer to the variables declared in the first part and are
used to define the state variables in the third part. The reader should
keep this overview in mind while reading the following continuation of
the line by line description.


.DEFINE 


This  line  marks  the  beginning  of  a set  of  definitions. 


SEARCH(F, X, N) =
   .IF N < 0
   .THEN 0
   .OR IF F=DICT_F[N] & X=DICT_X[N]
   .THEN DICT_Y[N]
   .OR IF .TRUE
   .THEN SEARCH(F, X, N-1) .END;

The function SEARCH searches the parallel arrays DICT_F and DICT_X in
last-in/first-out order, beginning with the N-th entry. If N is negative,
no entries remain to be searched and it returns 0. If the N-th entry
matches, it returns the corresponding entry from the array DICT_Y;
otherwise, it calls itself recursively to continue the search with
entry N-1.

TAIL(W) =
   .IF W=0
   .THEN .NIL
   .ELSE .LAMBDA F, X(
      SEARCH(F, X, STACKX[W]-1)) "
      TAIL(W-1) .END;

TAIL(W) is the list of the bottom W dictionaries on the stack. The
argument W must be a non-negative integer. When the function is called with
0, it returns the empty list .NIL. When it is called with a positive
integer W, it returns a list which it constructs by putting a function
expressed as a lambda expression onto the front of the list obtained by
calling itself recursively with W-1. The function represents the W-th
dictionary  and  is  constructed  by  calling  the  function  SEARCH  to  search 
the  parallel  arrays. 


.MAP 

The  keyword  .MAP  marks  the  beginning  of  the  definitions  of  the  abstract 
state variables in terms of the concrete data.

DICTIONARY = .LAMBDA F, X(SEARCH(F, X, RUN-1));

The state variable DICTIONARY is a function expressed as a lambda
expression. The recursive function SEARCH is called to search the
parallel arrays for F and X beginning just above the pointer RUN.

STACK = TAIL(LEVEL);

The  state  variable  STACK  is  defined  by  calling  the  recursive  function 
TAIL,  giving  it  the  pointer  to  the  last  element  in  STACKX  as  argument. 
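
The mapping of Figure 2 can be read as an abstraction function from the
concrete arrays to the abstract state. The following Python sketch mirrors
SEARCH, TAIL, and the two .MAP definitions. The record type Concrete, the
shortened array sizes, and the function names abstract_dictionary and
abstract_stack are assumptions of this sketch rather than part of the
verifier's input. The recursion follows the figure directly, so it is
suitable only for the short arrays used here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Concrete:
    """Concrete data of Figure 2, with array sizes shortened for illustration."""
    dict_f: List[int] = field(default_factory=lambda: [0] * 10)
    dict_x: List[int] = field(default_factory=lambda: [0] * 10)
    dict_y: List[int] = field(default_factory=lambda: [0] * 10)
    stackx: List[int] = field(default_factory=lambda: [0] * 5)
    run: int = 0
    level: int = 0


def search(c: Concrete, f: int, x: int, n: int) -> int:
    # SEARCH(F, X, N): scan entries N, N-1, ..., 0 for a match on (F, X).
    if n < 0:
        return 0
    if f == c.dict_f[n] and x == c.dict_x[n]:
        return c.dict_y[n]
    return search(c, f, x, n - 1)


def tail(c: Concrete, w: int) -> List[Callable[[int, int], int]]:
    # TAIL(W): cons the W-th saved dictionary onto TAIL(W-1).
    if w == 0:
        return []
    head = lambda f, x: search(c, f, x, c.stackx[w] - 1)
    return [head] + tail(c, w - 1)


def abstract_dictionary(c: Concrete) -> Callable[[int, int], int]:
    # DICTIONARY = .LAMBDA F, X(SEARCH(F, X, RUN-1))
    return lambda f, x: search(c, f, x, c.run - 1)


def abstract_stack(c: Concrete) -> List[Callable[[int, int], int]]:
    # STACK = TAIL(LEVEL)
    return tail(c, c.level)
```

For example, with two entries for the same pair (F, X), one made before a
PUSH and one after it, abstract_dictionary sees the later entry while the
single function in abstract_stack sees only the earlier one.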

2.4 IMPLEMENTATION OF THE DICTIONARY MODULE IN A PROCEDURAL LANGUAGE

This section describes Figure 3 in detail.

.ASSERT 0≤RUN<1000 & 0≤LEVEL<100 &
   ∀ I(0≤I≤LEVEL ⊃ 0≤STACKX[I]≤1000);


This  is  an  assertion  about  the  concrete  data  of  the  module.  It  is  to  hold 
true  immediately  after  initialization,  and  if  it  holds  true  just  before 
any  procedure  of  the  module  is  entered,  then  it  will  hold  true  just  after 
that  procedure  is  exited.  In  processing  the  module,  the  verifier  will 
prove  that  this  assertion  holds  true  immediately  after  initialization. 

It  will  assume  that  it  holds  true  just  before  entering  a procedure,  and 
prove  that  it  holds  true  just  after  exiting.  When  processing  a call  to 
one  of  the  procedures  of  this  module,  the  verifier  will  prove  that  this 
assertion holds before the call, and then assume that it holds afterwards.
This assertion is similar to the inductive assertions on loops which are
discussed later in this paper.
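
The role of the assertion about concrete data can be illustrated by a
run-time analogue. In the Python sketch below, module_invariant transcribes
the .ASSERT line as it is read in this section (the exact comparison
operators are partly a reading of the printed figure), and the hypothetical
wrapper checked tests the invariant before and after a procedure body. The
verifier itself proves these facts once, before execution; the run-time
checks here are only a stand-in for that proof.

```python
def module_invariant(run, level, stackx):
    # 0 <= RUN < 1000 & 0 <= LEVEL < 100 &
    # for all I (0 <= I <= LEVEL implies 0 <= STACKX[I] <= 1000)
    return (
        0 <= run < 1000
        and 0 <= level < 100
        and all(0 <= stackx[i] <= 1000 for i in range(level + 1))
    )


def checked(procedure):
    """Assume the invariant on entry and require it again on exit."""
    def wrapper(state):
        assert module_invariant(state["run"], state["level"], state["stackx"])
        procedure(state)
        assert module_invariant(state["run"], state["level"], state["stackx"])
    return wrapper


@checked
def pop(state):
    # RUN := STACKX[LEVEL]; LEVEL := LEVEL - 1;
    state["run"] = state["stackx"][state["level"]]
    state["level"] -= 1
```

A call to pop on a state with LEVEL equal to 0 would leave LEVEL negative
and fail the exit check, which corresponds to violating the entry assertion
of POP in the specification.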

.INITIAL 

RUN  :=  0;  LEVEL  :=  0;  STACKX[0]  :=  0; 

.RETURN 

This is the code to initialize the module. It will be executed after the
program is loaded, but before control goes to the main program. The
verifier  proves  that  both  the  initial  assertion  and  the  assertion  about 
concrete  data  hold  true  after  execution  of  this  code. 

.PROCEDURE  PUSH: 

.ERROR  UNLESS  LEVEL  < 99; 

LEVEL := LEVEL + 1; STACKX[LEVEL] := RUN;

.RETURN 


This  is  written  in  a typical  procedural  language.  The  meaning  of  the  error 
statement  is  to  abort  if  the  boolean  expression  LEVEL  < 99  is  not  true. 

The  verifier  is  only  proving  that  the  program  meets  its  specification  if  it 
terminates normally. Nothing is said about the case in which the program
aborts or gets into an infinite loop.

It is interesting to compare the entry assertion, which is part of the
specification, to the error statement in the procedure. The error
statement describes an implementation restriction on the procedure. The entry
assertion is checked by the verifier, and has no code generated for it.

The  error  expression  is  assumed  by  the  verifier,  because  the  compiler 
generates  code  to  abort  if  it  is  not  the  case.  The  error  statement  is 
usually  used  to  express  a storage  limitation.  In  this  case,  the  limit 
is  a maximum  of  100  elements  in  the  stack.  If  this  module  had  been 
implemented  in  a language  with  dynamic  storage  allocation,  it  might  have 
requested  additional  storage  for  the  stack  and  left  it  up  to  the  storage 
allocation  routine  to  abort  in  case  there  was  insufficient  space  available. 
It  would  not  have  been  appropriate  to  describe  in  the  specification  of 
PUSH  what  error  message  would  be  generated  in  case  of  insufficient  storage, 
because  that  would  have  forced  an  early  decision  about  whether  the  stack 
was  to  be  allocated  as  a fixed  block  or  in  some  general  pool. 
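
The distinction between the two kinds of condition can be sketched in
Python. The error statement becomes an explicit run-time test that aborts,
while the entry assertion (here .TRUE for PUSH) produces no code at all. The
exception name ImplementationLimit and the use of a dictionary to hold the
module's concrete data are assumptions of this sketch.

```python
class ImplementationLimit(Exception):
    """Raised when a storage limit of the implementation is exceeded."""


def push(state):
    # .ERROR UNLESS LEVEL < 99;
    # Compiled into a run-time test; the code after it may assume LEVEL < 99.
    if not state["level"] < 99:
        raise ImplementationLimit("STACKX is full")
    # LEVEL := LEVEL + 1; STACKX[LEVEL] := RUN;
    state["level"] += 1
    state["stackx"][state["level"]] = state["run"]
```

Because the limit lives only in the implementation, a version with
dynamically allocated storage could drop the test without any change to the
specification of PUSH.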

.PROCEDURE  POP: 

RUN := STACKX[LEVEL]; LEVEL := LEVEL-1;

.RETURN 

This procedure contains no new features, and needs no special explanation.

.PROCEDURE PUT:

The formal parameters, F, X, and Y, are not shown here since they have
already been given in the specification of PUT.

.DECLARE H;

This  is  a declaration  of  the  local  variable  H as  a simple  integer  variable. 

.ERROR  UNLESS  RUN  < 999; 

H := STACKX[LEVEL];

The  error  statement  has  already  been  discussed,  and  the  assignment  is 
similar  to  those  which  have  been  seen  before. 

.LOOP ∀ I(STACKX[LEVEL]≤I<H' ⊃ F≠DICT_F[I] | X≠DICT_X[I]) &
      STACKX[LEVEL]≤H'
.WHILE H<RUN+1 & (F≠DICT_F[H] & X≠DICT_X[H]) .DO
   H := H+1; .END

This  is  the  first  loop  statement  that  has  been  encountered.  It  differs 
from  the  usual  loop  statement  of  a procedural  language  in  that  there  is 
an  inductive  assertion  between  the  key  words  .LOOP  and  .WHILE,  as  well  as 
the  usual  loop  termination  condition  between  the  key  words  .WHILE  and  .DO. 
The inductive assertion is like a boolean expression, but it can also
contain universal and existential quantifiers. Unprimed variables in the
inductive  assertion  represent  their  value  before  the  loop  was  first 
entered;  primed  variables  represent  their  value  at  some  arbitrary  time 
when  control  has  reached  the  top  of  the  loop.  The  verifier  must  prove 
that  the  inductive  assertion  is  true  when  control  first  enters  the  loop, 
and  that  if  it  is  true  at  the  beginning  of  some  arbitrary  time  through 
the loop, then it will be true at the end.

In a future version of the verifier, we hope to eliminate much of the
need for the user to write inductive assertions. One way to do this is
to have them automatically generated by the verifier, using techniques
such as those described by Wegbreit[26].

Another approach is to use high-level language constructs, as was
suggested by Gerhart[7], although not necessarily APL constructs. For
example, the author has found that the programmer can supply most of
the invariants needed to prove that subscripts are within bounds by
specifying subranges in the declaration of variables. This subrange
feature is currently in a number of languages, such as PASCAL, which is
described by Wirth[27].

The  remainder  of  the  procedure  PUT,  as  well  as  the  entire  procedure  GET, 
involves  nothing  new,  so  the  line  by  line  discussion  need  not  be  continued. 
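
The inductive assertions discussed in this section can be illustrated by a
run-time analogue modeled on the loop of GET in Figure 3. In the Python
sketch below, the assertion is tested each time control reaches the top of
the loop, whereas the verifier proves it once for all executions. The
function name get, the passing of the arrays as arguments, and the guard for
an empty table are assumptions of this sketch.

```python
def get(dict_f, dict_x, dict_y, run, f, x):
    """Backward linear search over entries 0..run-1, checking the loop assertion."""
    if run == 0:
        # Guard added for this sketch: with no entries, the result is 0.
        return 0
    h = run - 1
    while True:
        # Inductive assertion, tested at the top of the loop:
        # every entry strictly above h and at most run-1 fails to match,
        # and h has not moved above its starting value.
        assert all(f != dict_f[i] or x != dict_x[i] for i in range(h + 1, run))
        assert h <= run - 1
        # .WHILE 0 < H & (F /= DICT_F[H] | X /= DICT_X[H]) .DO
        if not (0 < h and (f != dict_f[h] or x != dict_x[h])):
            break
        h -= 1
    # On exit either entry h matches or h is 0; entry 0 is tested here.
    if f == dict_f[h] and x == dict_x[h]:
        return dict_y[h]
    return 0
```

The assertion together with the negated loop condition is what justifies the
final test: if entry h does not match, then h is 0 and no entry at all
matches, so returning 0 agrees with the abstract function.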

3. THE VERIFIER--CURRENT STATUS

The preceding example is to be processed by a verifier which combines the
Floyd  approach  of  inductive  assertions  with  the  techniques  of  proving 
theorems  about  recursive  functions,  as  originated  by  Boyer  and  Moore[2], 
and  extended  by  Moore[18].  As  in  the  Floyd  approach,  procedures  are 
described  by  entry  and  exit  assertions.  This  is  appropriate,  because  the 
objective  is  to  handle  procedures  which  have  side  effects  on  global  data. 
The  current  verifier  further  follows  the  Floyd  approach  by  requiring 
that  the  user  supply  inductive  assertions.  The  Floyd  approach  involves  a 
symbolic execution--or possibly a backward substitution--between each pair
of assertions. Deutsch[4] improved this by doing a symbolic execution
beginning at each assertion. When a branch point is reached, the execution
splits. Thus common initial parts of paths are handled together. SDC's
verifier does a single symbolic execution of an entire procedure. It
evaluates expressions in terms of the values which the variables had upon
entry to the procedure, or--in case of variables killed in loops--in
terms of the value which the variable had at the beginning of some
arbitrary execution of the loop.

SDC's  verifier  uses  a hash  coding  technique,  similar  to  that  of  Deutsch[4], 
to  build  information  as  it  does  the  symbolic  execution.  Deutsch  must 
clear his dictionary before he starts a symbolic execution at a new
assertion.  When  SDC's  verifier  reaches  the  inductive  assertion  at  the 
beginning  of  the  loop,  it  generates  new  values  for  the  variables  killed 
in  the  loop.  This  is  a technique  taken  from  global  program  optimization 
as  described  by  Cocke  and  Schwartz[3]  pages  252-283.  No  information  need 
be  removed  from  the  dictionary  at  this  point.  Information  about  the 
variables  killed  in  the  loop  becomes  irrelevant  due  to  the  new  value  of 
these  variables.  Information  not  involving  killed  variables  remains  in  the 
dictionary  and  does  not  have  to  be  re-entered,  as  with  Deutsch's  verifier. 
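
The treatment of variables killed in a loop can be sketched as follows. This
Python fragment is not SDC's implementation; it only illustrates the idea
that, on reaching the inductive assertion, each variable assigned in the
loop body receives a fresh symbolic value while everything else in the
symbolic state is kept. The representation of a loop body as a list of
(target, expression) pairs and the naming scheme for fresh values are
assumptions of this sketch.

```python
import itertools


def killed_variables(loop_body):
    """Variables assigned anywhere in the loop body."""
    return {target for target, _expr in loop_body}


def enter_loop(state, loop_body, fresh):
    """Return the symbolic state at the top of an arbitrary iteration.

    Killed variables receive new symbolic values; all other variables keep
    the values, and hence the recorded facts, they had before the loop.
    """
    new_state = dict(state)
    for var in sorted(killed_variables(loop_body)):
        new_state[var] = f"{var}_{next(fresh)}"
    return new_state


# Symbolic state on reaching the loop of PUT: only H is assigned in the body.
fresh = itertools.count(1)
before = {"H": "STACKX0[LEVEL0]", "RUN": "RUN0", "F": "F0"}
body = [("H", "H + 1")]
after = enter_loop(before, body, fresh)
```

Facts recorded about the old value of H are still in the dictionary but no
longer refer to the current H, while facts about RUN and F remain usable
without being re-entered.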

SDC's verifier accepts only structured programs, that is, programs without
GO  TO  statements.  The  reader  will  recall  that  SDC's  method  of  symbolic 
execution  does  require  a knowledge  of  loop  structure  because  it  assigns 
new  values  to  variables  killed  in  loops.  If  one  wished  to  encourage  bad 
programming  practice,  one  could  program  the  verifier  to  discover  loops  by 
interval  analysis.  This  type  of  analysis  has  been  described  by  Sites[23] 
page 7-16 in connection with program verification. It is described in more
detail in relation to global program optimization by Cocke and Schwartz[3]
pages 284-303 and in still greater detail by Schaefer[21] chapters 3-7.
This  latter  book  gives  a very  rigorous  mathematical  treatment. 

4.  CONCLUSIONS 

The author believes that a verifier is an important tool to be used where
exceptionally high reliability is demanded. Programmers using this tool
will need enough background in mathematics or symbolic logic to understand
what a proof is.

There has been a great deal of progress over the past few years in the
status of program verification software. This paper has described one
of these tools as it is currently being implemented at SDC.

ABSTRACT:

Hierarchical  programming  is  being  increasingly  recognized  as 
helpful  in  the  construction  of  large  programs.  Users  of  hierarchical 
techniques  claim  or  predict  substantial  increases  in  productivity  and 
in  the  reliability  of  the  programs  produced.  In  this  paper  we  describe 
a formal  method  for  hierarchical  program  specification,  implementation, 
and  proof.  We  apply  this  method  to  a significant  list  processing  problem 
and  also  discuss  a number  of  extensions  to  current  programming  languages 
that ease hierarchical program design and proof.

1. INTRODUCTION


Hierarchical  programming  is  being  increasingly  recognized  as  helpful  in 
the  construction  of  large  programs.  Users  of  hierarchical  techniques  claim  or 
predict  substantial  increases  in  productivity  and  in  the  reliability  of  the 
programs  produced.  In  this  paper  we  describe  a formal  method  for  hierarchical 
program specification, implementation, and proof. We apply this method to a
significant list processing problem and also discuss a number of extensions to
current programming languages that ease hierarchical program design and proof.
The reader may consult [16] for a more detailed explanation of the method and
[17]  for  a presentation  of  its  application  in  designing  a secure  operating 
system. 

Most  contemporary  programming  languages  provide  some  facilities  for 
creating  structured  programs,  for  example,  subroutines,  conditional 
statements,  and  blocks,  that  can  help  to  decompose  a programming  task  into  a 
set  of  subtasks.  It  is  shown  in  [6]  that  the  proof  of  a hierarchically 
structured  program  can  be  divided  into  showing  that  the  parts  satisfy  their 
specifications  and  then,  using  just  the  specifications  of  the  parts,  proving 
the  correctness  of  the  whole  program.  Of  course,  this  process  can  be  repeated 
if  the  parts  are  similarly  decomposable  into  subparts.  (Note  that  the 
principle of recursion induction [9] asserts that this technique applies to
recursive procedures as well.)

Syntactically structuring program text, so that concise mathematical
specifications of textual fragments can avoid irrelevant implementation
details (such as the values of temporary variables), is certainly a good idea.
Unfortunately, with pure textual abstraction the specification of a fragment
must state the relation between the fragment's inputs and outputs in terms of
the concrete data structures of the language. This is unfortunate since, for
example, a manipulation of a binary tree has to be specified in terms of the
representation  of  the  tree,  perhaps  an  array.  The  design  and  proof  process  is 
surely  facilitated  if  such  manipulations  are  specified  in  terms  of  the 
primitive  operations  meaningful  for  trees — inserting  nodes,  accessing 
designated  descendants  of  a node,  etc.  The  general  principle  here  is  that 
implementation  ought  to  be  separated  from  specification. 

Moreover, in a program organized as a set of procedures, there is often
data that must be shared by several procedures. If such a group of procedures
is  correct,  then  surely  each  of  them  maintains  shared  data  correctly.  But, 
even  if  authorized  procedures  do  correctly  maintain  shared  data,  in  most 
languages  there  is  no  easy  way  to  preclude  other  access  to  the  data;  erroneous 
modification  of  the  common  data  by  other  parts  of  a large  program  remains 
possible.  Thus,  as  Morris  points  out  [12],  "...we  are  only  interested  in 
those  properties  of  a subprogram  which  are  invariant  over  all  the  possible 
program  texts  into  which  the  subprogram  might  be  inserted."  In  the  absence  of 
constructs  that  forbid  erroneous  external  modification,  these  invariant 
properties  must  depend  on  the  validity  of  the  common  data.  If  erroneous 
external modification is impossible, then the difficulty of proving a small
part of a large program is a function of the size of the part rather than of
the size of the whole.

A number of modern programming languages, including Simula [1], Clu [7],
EL1 [16], and Alphard [20], either provide (or, in the cases of Simula and
EL1, can be easily extended to provide) constructs that limit access to common
data in a practical and elegant manner. The common feature of these languages
is provision for abstract data types. (We will also use the Simula term
class.) The essence of the class is that it allows the collection of
designated class data and class procedures into a single unit so that (in a
modification of Simula to be described) it is possible to ensure that the
class data are modified only by the class procedures. A Simula class
definition defines the form of the members of the class, the class instances.
Each instance may be viewed as an abstract sequential machine: the values of
the local data of the class give its state; the class definition determines
the initial state; and the invocations of class procedures cause state
transitions. For example, an element of the class of binary search trees
might have initial internal data representing an empty tree and procedures to
insert new nodes and return the contents of existing nodes. An attractive
feature of this organization is that the user manipulates class instances
without access to or concern about their internal representation (except as
provided indirectly through the class procedures).
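
The binary search tree example can be illustrated informally in a present-day language. The following Python sketch is not from the original text; the names BinarySearchTree, insert, and contents are assumptions chosen for illustration. Note also that Python marks the representation as private only by convention (the leading underscore), whereas the modified Simula discussed here is meant to enforce the restriction.

```python
class BinarySearchTree:
    """A class instance viewed as an abstract sequential machine (illustrative)."""

    def __init__(self):
        # Initial state: the empty tree. Each node is [key, contents, left, right].
        self._root = None

    def insert(self, key, contents):
        """State transition: add a node, or replace the contents of an existing key."""
        if self._root is None:
            self._root = [key, contents, None, None]
            return
        node = self._root
        while True:
            if key == node[0]:
                node[1] = contents
                return
            slot = 2 if key < node[0] else 3   # descend left or right
            if node[slot] is None:
                node[slot] = [key, contents, None, None]
                return
            node = node[slot]

    def contents(self, key):
        """Interrogate the state without changing it; None if the key is absent."""
        node = self._root
        while node is not None:
            if key == node[0]:
                return node[1]
            node = node[2] if key < node[0] else node[3]
        return None


tree = BinarySearchTree()
tree.insert(5, "five")
tree.insert(2, "two")
tree.insert(8, "eight")
print(tree.contents(2))
```

A user of this class sees only insert and contents; the nested-list representation could be replaced by an array without changing any calling program.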

In a hierarchy of class instances (also known as abstract machines), the
machines form the nodes of a tree. The root machine solves a particular
problem and the sons of each node are machines in terms of which the parent
machine can be realized. The leaf nodes are primitive machines that are
directly  realized  by  some  existing  hardware  or  programming  language 
implementation. 


We  have  developed  a method  [15,17]  that  exploits  the  properties  of  such  a 
hierarchy  to  enhance  provability.  A hierarchy  of  abstract  machines  is 
selected  that  solves  a particular  problem.  Instead  of  directly  implementing 
the machines, we first formally specify them as Parnas modules [14,15]. Next,
for each nonleaf machine M, we write a representation function from a subset
of the Cartesian product of the state spaces of M's sons onto the state space
of M. Finally, we write abstract programs that implement the operations of
each machine by suitable calls on the operations of its sons. It is then
possible to prove the correctness of these programs with respect to the
specifications  of  the  machine,  the  specifications  of  its  sons,  and  the 
corresponding  representation  function. 

The proof of correctness of a complex program can be made practical by
this approach. The hierarchy can be structured so that the abstract programs
and  the  specifications  are  all  quite  concise.  Thus,  proving  that  a large 
program  satisfies  certain  complex  specifications  reduces  to  proving  that 
several  small  programs  satisfy  simpler  specifications. 

In  Section  2,  we  present  the  formalism  that  we  use  for  specification, 
implementation,  and  proof.  In  Section  3,  we  introduce  a programming  problem 
and  specify  a hierarchy  of  machines  that  gives  a solution.  This  problem, 
solved  by  Deutsch  [3],  is  to  efficiently  maintain  (i.e.,  create  and  access) 
lists  so  that  no  two  are  isomorphic.  In  Section  4,  we  give  representation 
functions  and  abstract  programs  for  this  hierarchy  and  sketch  part  of  the 
proof of correctness. We also present a modification of Simula that serves as
the basis of the implementation. We use Simula in a dual role: to provide the
syntax and primitive operations whereby an abstract machine is implemented by
other machines; and as the language whose capabilities are the leaves of the
hierarchy. In Section 5, we use the specifications of the abstract machines
that are directly visible to the user to prove some results relating the
functions available at this user interface. In Section 6, we present some
concluding remarks.


2.  DESIGN  AND  PROOF  METHOD 


We  view  program  construction  as  consisting  of  the  following  stages: 

(1)  Selecting  a hierarchy  of  abstract  machines. 

(2) Specifying each machine as a Parnas module. (Note that a single
specification may describe each of a number of identically behaving
machines.)

(3) Representing the states of each nonleaf machine in terms of the
states of the machines directly below it.

(4) Implementing the initialization and operations of each nonleaf
machine in terms of the operations of the machines directly below it.

(5)  Writing  each  implementation  as  an  executable  program  in  a suitable 
programming  language. 

(This  is  a convenient  way  to  present  a finished  result;  in  practice,  trying  to 
complete  the  process  or  to  prove  correctness  often  leads  to  revision  of  the 
original  design.) 

In  the  first  two  stages  we  design  a hierarchy  and  describe  each  of  its 
nodes  as  an  abstract  machine  consisting  of  a state  and  various  operations  that 
can interrogate or change the state. Following [14], the state is defined by
a set of value-returning functions called V-functions and the state-changing
operations are defined by functions called O-functions and OV-functions.
(O-functions serve only to change the state; OV-functions return a value as
well.)


A V-function specification consists of:

An ASSERT section, which is an assertion characterizing the
allowable argument values in terms of the abstract machine state.

An INITIAL section, which is an assertion constraining initial
values of the function. (The INITIAL sections of a module, taken
together, define the set of initial states of the abstract machine.)

An O- or OV-function specification consists of:

An ASSERT section, which is like that of a V-function.

A set of expressions, called EFFECTS, characterizing the changed
state due to a call on the function by means of the resulting V-function
values. These expressions may allow several successor states (just as
the INITIAL assertions may allow several initial states), but our proofs
are independent of such ambiguities. Nondeterminism is permitted because
it is a natural way to avoid "overspecifying".


V-functions are either primitive or derived. The values of the primitive
V-functions of a module determine its state, while the values of the derived
V-functions are just a function of this state. Thus, instead of an INITIAL
section, a derived V-function has a DERIVED section that gives its value in
terms of primitive V-functions.

Finally, an OV-function specification is just like an O-function
specification except that its EFFECTS section also constrains the value to be
returned.  Although  OV-functions  are  not  strictly  necessary,  their  use  makes 
specifications  more  concise  and  thus  we  use  them  freely. 

The final stage of program construction, in this formal method, is
implementation and, if desired, proof of correctness. Suppose machine M is a
node in the hierarchy whose sons are the machines M1, ..., Mt. We refer to
the functions of these machines by the following generic names: v for
primitive V-functions; dv for derived V-functions; o for O-functions; and ov
for OV-functions. A state of M is then an interpretation of its v's; a state
of the abstract machine consisting of M1, ..., Mt (which we will call N) is an
interpretation of the v's of all of the Mi's.

A formal implementation of M in terms of N then consists of:

(a) A representation function rep from a subset of the states of N onto
the states of M. A state sn of N that is intended to model a state
of M will be in the domain of rep and rep(sn) will be the modeled
state. The use of rep is an aid to design and proof because it
corresponds to the intuitive idea of composing a data structure out
of simpler structures (cf. [5]).

(b) An initialization program INIT written in terms of N. INIT should
achieve a state of N that models an initial state of M.

(c) A program of machine N, denoted impl(f,x), for each possible
invocation f(x) of a function of M (where x is the n-tuple of
arguments [x1, ..., xn]). If f is a v, dv, or ov, this program must
return a result denoted impl*(f,x).

Given a formal implementation, we wish to provide the facilities of M
where only the facilities of N are primitively available. (Of course the
process may be repeated to implement the components Mi of N in terms of
simpler abstract machines.) With a formal implementation, the operations of M
are obtained as follows:

(1) The components of N are initialized.

(2) INIT is executed.

(3) Then, for each call f(x) of a function of M, impl(f,x) is executed
and, if appropriate, the result impl*(f,x) is returned.

We naturally intend that a formal implementation be correct. It should
be noted that a major objective of the formalism we have described is to
increase reliability of the programs that result by partitioning their
construction into separate stages and by making their structure clear and
explicit. Thus the importance of formal proof for establishing correctness is
reduced. For the same reasons, however, if proof is desired, for example, for
particularly critical modules of a system, it is much easier to achieve than
it would be in the absence of careful structuring. Specifically, suppose the
functions of N are available (either in underlying hardware or in software)
and correctly implemented. Then we need sufficient conditions on INIT, rep,
impl, and impl* to guarantee that the implementations of the functions of M
satisfy  the  specifications  of  these  functions.  These  conditions  are  as 
follows: 

(C1) Suppose that T is an initial state of N and that state T' results
from executing INIT beginning in state T. Then we require that T' ∈
domain(rep) and that rep(T') is an initial state of M.

(C2) C1 guarantees that the states of M and N are initially "in step".
It is also necessary that, whenever the states change, they remain
in step. More precisely, suppose that S=rep(T) and that impl(f,x),
executed in N from state T, can yield state T' of N. (Recall the
possibility of nondeterministic specifications for f.) Then we
require that rep(T')=S', where S' is a possible result of the
invocation f(x) from state S in M. Figure 1 describes this
situation by showing states as nodes of a diagram and state
transitions as directed arcs. C2 asserts that this diagram is
commutative.

(C3) If S, T, and T' are as in (C2) and f is a v or dv, then
rep(T)=rep(T'). That is, we do not allow the implementation of a
V-function to change the state that is represented in the high-level
machine.

(C4) The value that is returned by a V-function is implicit in the state
of the abstract machine. The value to be returned by an OV-function
is constrained by both the state of the abstract machine and the
function's specification. In both cases, these constraints must be
satisfied by the actual value that is returned by the corresponding
abstract program, impl*(f,x). Specifically, if S, T, S', and T' are
as in (C2) then:

(a) impl*(v,x) must be the value of v on argument x in S.

(b) The statement that dv(x), invoked in state S, returns
impl*(dv,x), must be consistent with the derivation of dv in S.

(c) The statement that ov(x), invoked in state S, returns
impl*(ov,x), must be consistent with the specification of (the
result of) ov.


Note that proving C1-C4 for M and N, a pair of adjacent levels in a
hierarchy,  demonstrates  that  the  correctness  of  the  implementation  of  N 
implies  the  correctness  of  the  implementation  of  M.  Thus,  if  we  carry  out 
such  proofs  for  all  pairs  of  adjacent  levels  in  a hierarchy,  it  follows  that 
the  correct  operation  of  the  lowest  levels  of  the  hierarchy  (hardware  or 
programming  language  system)  implies  the  correctness  of  the  entire 
implementation. 
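
To make conditions C1-C4 concrete, the following Python sketch checks them by exhaustive testing on a deliberately tiny example that is not taken from the paper: an abstract machine M whose state is a finite set of integers (V-function member, O-function insert), implemented on a lower machine N whose state is a tuple. All names (m_insert, impl_insert, rep, and so on) are assumptions made for illustration, and testing of this kind is of course weaker than the proofs the method calls for.

```python
from itertools import product

UNIVERSE = (0, 1, 2)   # small argument domain so that checking is exhaustive

# Abstract machine M: the state is a frozenset. These play the role of specifications.
def m_initial():
    return frozenset()

def m_insert(state, x):        # O-function: new state after insert(x)
    return state | {x}

def m_member(state, x):        # V-function: value of member(x)
    return x in state

# Lower machine N: the state is a tuple of stored items.
def n_initial():
    return ()

def rep(t):                    # representation function from N-states onto M-states
    return frozenset(t)

def init(t):                   # INIT is the empty program in this example
    return t

def impl_insert(t, x):         # impl(insert, x): returns the new N-state
    return t if x in t else t + (x,)

def impl_member(t, x):         # impl(member, x): returns (new N-state, result)
    return t, x in t

def check_conditions(depth=4):
    """Check C1-C4 along every sequence of `depth` insert calls."""
    # C1: INIT applied to an initial N-state models an initial M-state.
    assert rep(init(n_initial())) == m_initial()
    for calls in product(UNIVERSE, repeat=depth):
        t = init(n_initial())
        for x in calls:
            s = rep(t)
            for y in UNIVERSE:
                t_after, result = impl_member(t, y)
                assert rep(t_after) == s             # C3: V-function leaves rep unchanged
                assert result == m_member(s, y)      # C4: returned value agrees with M
            t_next = impl_insert(t, x)
            assert rep(t_next) == m_insert(s, x)     # C2: the diagram commutes
            t = t_next
    return True

print(check_conditions())
```

Here rep is many-to-one: the tuples (1, 2) and (2, 1) both represent the set {1, 2}, which is why rep maps from N onto M rather than the reverse.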


3.  SPECIFICATION  FOR  MODULES  OF  THE  EXAMPLE 


Introduction 


In  this  section  we  present  a hierarchy  of  abstract  machines  to  solve  the 
problem  of  maintaining  a class  of  unique  lists — lists  such  that  no  two  are 
structurally  isomorphic.  Thus  the  attempt  to  create  one  of  these  lists  yields 
an old cell if a suitable cell exists. [As an example of the utility of such
a facility, suppose we save the property SIMPLIFIES-TO-ZERO in some table
under, as key, the address of the list (SUBTRACT x x). If we subsequently
independently create a conventional list of the same form, it will have a
different address and the property will not be retrieved. But if both are
unique then their addresses will be the same so that the property can be
looked up successfully.] Naturally, we want the check of existing cells to be
efficient.  We  use  a particularly  effective  method  due  to  Deutsch,  which  he 
introduced  in  his  verification  system  [3]  to  associate  properties  with 
arbitrary  symbolic  expressions. 

We begin by explaining our notation. First, note that when we wish to
refer to both an old and a modified machine state, we write 'v'(x) to denote
the result of the call v(x) in the old state and v(x) to denote its value in
the new state. (Here x denotes a general argument n-tuple.) We omit from EFFECTS
sections facts about the new state that were true in the former state.

Thus,  if  for  some  V-function  v and  argument  x,  there  is  no  reference  in 
an  EFFECTS  section  to  v(x),  it  is  implicit  that  v(x)  = 'v'(x). 

Each function can have input arguments and, for V- and OV-functions, a
single output argument, used to refer to the value returned. (The name of
this result follows the semicolon in the function header.) The type of an
argument associated with a function is defined by the letter used for the
argument in the header of the function specification. In the case of the CONS
module the following types and their defining letters are used: b:Boolean,
c:list cell, and x:any (i.e., unrestricted). Booleans are primitive in our
language, but we specify the semantics of list cells. (We view the question
of how best to use more complex data types, such as those of EL1 [13], as
worthy of study but do not address it here.) For OV- and derived V-functions
the value given to the output argument, as a result of a call on the function,
is generally characterized by the formulas in EFFECTS or DERIVED.


Specification of CONS


Clearly a facility is needed to maintain conventional lists. This is
specified as an abstract machine, called the CONS module, that provides the
following primitive list-processing functions:

consp(x;b): returns the value TRUE if x is a previously created list.

car(x;x1): returns the head of x. The precondition is consp(x), that is,
x is a previously created list. The INITIAL values of car are
unconstrained, since the precondition precludes arguments other than
existing lists and there are no such values initially.

cdr(x;x1): returns the tail of x. The preconditions and INITIAL values
are the same as those of car.

cons(x1,x2;c): returns a new list c, whose car is x1 and whose cdr is x2.
Note that we assume an infinite supply of new values to be used as cons
cells; a call may return any value (such as the address of a previously
free cell) with suitable car, cdr, and consp.

equal(x1,x2;b): returns TRUE if and only if x1 and x2 have the same list
structure and identical non-list components. Equal is derivable from
the other V-functions (consp, car, and cdr) and from the primitive
function "=" which is TRUE only when its arguments are identical.

We specify this module as having primitive V-functions consp, car, and
cdr; OV-function cons; and derived V-function equal. The formal specification
is:


V-function: consp(x;b)
Initial: b=FALSE

V-function: car(x;x1)
Initial: TRUE
Assert: consp(x)

V-function: cdr(x;x1)
Initial: TRUE
Assert: consp(x)

OV-function: cons(x1,x2;c)
Assert: TRUE
Effects: ~'consp'(c)
consp(c)
car(c)=x1
cdr(c)=x2

V-function: equal(x1,x2;b)
Assert: TRUE
Derived: if x1=x2 then b=TRUE else
if consp(x1) and consp(x2)
then b=(equal(car(x1),a
and equal(cdr()
else b=FALSE

Observe  that  if  a call  of  equal  satisfies  it: 
guarantees  that  the  arguments  to  any  r< 
assertions. 

These  specifications  precisely  describe 
processing  operations:  car,  cdr,  cons,  cor 
LISP),  = (which  is  called  "eq"  in  LISP),  ar 
original  problem,  however,  since  cons  alwaj 
though the car and cdr of that cell are id
cell. It is clear that we need additional me
not happen. Thus we will specify search and
keep  track  of  existing  list  cells. 
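
As an informal aid, the CONS module can be modeled in a few lines of Python. This sketch is an illustration only, not the paper's Simula implementation: the names Cell and Cons are assumptions, the primitive V-functions car and cdr are modeled as tables keyed by cell, and the body of equal follows the prose description given earlier (same list structure and identical non-list components), since the formal DERIVED section is incomplete in this copy. Python's == stands in for the primitive "=" (it is identity on Cell objects and value equality on atoms).

```python
class Cell:
    """An opaque value standing for a list cell; its identity plays the role of an address."""


class Cons:
    """Illustrative model of the CONS abstract machine."""

    def __init__(self):
        # Initially consp is FALSE everywhere, so both tables are empty.
        self._car = {}
        self._cdr = {}

    def consp(self, x):
        return isinstance(x, Cell) and x in self._car

    def car(self, x):
        assert self.consp(x)        # ASSERT section of car
        return self._car[x]

    def cdr(self, x):
        assert self.consp(x)        # ASSERT section of cdr
        return self._cdr[x]

    def cons(self, x1, x2):
        c = Cell()                  # a value c for which 'consp'(c) was FALSE
        self._car[c] = x1
        self._cdr[c] = x2
        return c

    def equal(self, x1, x2):
        if x1 == x2:
            return True
        if self.consp(x1) and self.consp(x2):
            return (self.equal(self.car(x1), self.car(x2))
                    and self.equal(self.cdr(x1), self.cdr(x2)))
        return False


m = Cons()
a = m.cons("x", None)
b = m.cons("x", None)
print(a is b, m.equal(a, b))
```

The two cells a and b are equal but not identical, which is exactly the difficulty that the HASH-CONS module is introduced to remove.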


Specification  of  HASH-CONS 


The  HASH-CONS  module  is  similar  to  the  C 
in  its  list-creating  mechanism.  It  is  specif 

V-function: hconsp(x;b)
Initial: b=FALSE

V-function: hcar(x;x1)
Initial: TRUE
Assert: hconsp(x)

V-function: hcdr(x;x1)
Initial: TRUE
Assert: hconsp(x)

OV-function: hcons(x1,x2;c)
Assert: TRUE
Effects: Let A = {c1 : 'hconsp'(c1)
and 'hcar'(c1)=x1
If A≠{ } then c ∈ A
else ~'hconsp'(c)
hconsp(c)
hcar(c)=x1
hcdr(c)=x2

In the EFFECTS of hcons, the set A of existi
will, in fact, always have at most one member
Section 5, Theorem 1 below) and thus we use
the specification to avoid the appearance of

The functions hconsp, hcar, and hcdr are, respectively, like c
and cdr in CONS. The function hconsp returns TRUE if its argument i
cons'ed list; hcar and hcdr return, respectively, the head and tai
lists. The OV-function hcons, applied to arguments x1 and x2, retur
cons'ed list with hcar x1 and hcdr x2, but this list is new o
hcons'ed list already exists with these components. The reader ma
the absence of an hequal, analogous to equal in the CONS module, b
function is superfluous as we prove below.
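
The intended behavior of hcons can be sketched directly from this description. The Python model below is a specification-level illustration, not the efficient implementation developed later: it finds an existing cell by scanning all known cells. The names HashConsSpec and hlist are assumptions, and the test for an existing cell (matching both hcar and hcdr) follows the prose above, since the formal EFFECTS section is incomplete in this copy.

```python
class Cell:
    """An opaque list cell; identity stands for its address."""


class HashConsSpec:
    """Illustrative, deliberately inefficient model of the HASH-CONS specification."""

    def __init__(self):
        self._hcar = {}
        self._hcdr = {}

    def hconsp(self, x):
        return isinstance(x, Cell) and x in self._hcar

    def hcar(self, x):
        assert self.hconsp(x)
        return self._hcar[x]

    def hcdr(self, x):
        assert self.hconsp(x)
        return self._hcdr[x]

    def hcons(self, x1, x2):
        # The set A of existing cells with the requested components.
        a = [c for c in self._hcar
             if self._hcar[c] == x1 and self._hcdr[c] == x2]
        if a:
            return a[0]             # an old cell is returned; no state change
        c = Cell()                  # otherwise a new cell is created
        self._hcar[c] = x1
        self._hcdr[c] = x2
        return c


h = HashConsSpec()

def hlist(*items):
    """Build a unique list from right to left (hypothetical helper)."""
    tail = None
    for item in reversed(items):
        tail = h.hcons(item, tail)
    return tail

first = hlist("SUBTRACT", "x", "x")
second = hlist("SUBTRACT", "x", "x")
properties = {first: "SIMPLIFIES-TO-ZERO"}   # property table keyed by cell identity
print(properties[second])
```

Because independently built lists of the same form are the same cell, the property saved under the first is retrieved through the second, as in the SUBTRACT example given earlier.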

In  Section  6 we  will  prove  several  results  about  the  CONS  and 
machines as they appear to a top-level user. These proofs are based
the specifications of CONS and HASH-CONS and the reader may prefer t
them now, deferring the presentation and proof of the rest of the
which is the subject of the remainder of this section, Section 4, a
5. 


Specification  of  SEARCH 


Our  solution  requires  that  all  lists  of  the  HASH-CONS 
maintained  in  a data  base  which  can  be  (preferably  rapidly)  sea 
associative  search,  provided  by  our  SEARCH  module,  is  a convenient  a 
for  this  purpose: 

V-function: get(x1;x)
Initial: x=NONE

OV-function: save(x1,x2;x)
Effects: if 'get'(x1)=NONE
then x=get(x1)=x2
else x='get'(x1)


Arbitrary values are saved and retrieved by means of a key
function get does the retrieval; its argument is the key. Ini
returns the distinguished default value NONE. The OV-function save,
arguments x1 and x2, associates the value x2 with the key x1. Howe
value has been previously saved with key x1, then that value is
returned and the data base is not changed. This dual use of save i
because, in our application, once an entry is associated with a
key, that correspondence never changes.
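
The SEARCH specification translates almost line for line into executable form. In the Python sketch below, a dictionary stands in for the abstract state; that choice, the class name Search, and the sentinel object NONE are assumptions for illustration (the paper's own implementation, given later, uses assoc lists), and keys are assumed to be hashable.

```python
NONE = object()   # the distinguished default value


class Search:
    """Illustrative model of the SEARCH module: write-once associative storage."""

    def __init__(self):
        self._table = {}            # initially get(x1) = NONE for every x1

    def get(self, x1):
        return self._table.get(x1, NONE)

    def save(self, x1, x2):
        # EFFECTS: if 'get'(x1)=NONE then x=get(x1)=x2 else x='get'(x1)
        if self.get(x1) is NONE:
            self._table[x1] = x2
        return self.get(x1)


s = Search()
print(s.save("key", 1), s.save("key", 2), s.get("key"))
```

The second call of save returns the value stored by the first and leaves the table unchanged, which is the dual use described above.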


Specification  of  VARIABLE 


Finally  we  need  a VARIABLE  module  to  maintain  a single  value, 
the  distinguished  value  NONE.  Its  specification  is: 

V-function: load(;x)
Initial: x=NONE

O-function: store(x)
Effects: load()=x

The V-function load returns the saved value and the O-function
makes x the saved value. (This very simple module serves a m
purpose in the hierarchy: we implement SEARCH in terms of CONS a
Section 4, but could not implement it with CONS alone.)


Summary 


We can now describe the organization of these four modules
the unique list maintenance problem. It is given in Figure 2.
HASH-CONS to create unique lists. He can also create conventic
calling the (highest level) CONS machine. HASH-CONS is impl
CONS, SEARCH and VARIABLE. As we explain in the next section,
machines are used in the implementation, corresponding to the
between HASH-CONS and SEARCH in Figure 2. Finally, each SEARC
implemented by a VARIABLE machine and a CONS machine.


4.  FORMAL  IMPLEMENTATIONS 


Introduction 


In this section we present formal implementations for
specified in Section 3. There are four such implementations:
terms of CONS, SEARCH, and VARIABLE; SEARCH in terms of CONS
CONS in terms of Simula; and VARIABLE in terms of Simula. We als
proof of correctness of the HASH-CONS implementation.


Formal Implementation and Proof of HASH-CONS


First we describe the implementation of HASH-CONS. The stat
CONS virtual machine M1 is defined by interpretations of the
functions hconsp, hcar, and hcdr. We wish to represent such a st
of a state of a virtual machine M2 for CONS, SEARCH, and VARIABLE
is defined by the interpretations of the primitive V-functions cc
cdr (of CONS), get (of SEARCH), and load (of VARIABLE). Thus,
description of a formal implementation given in Section 2, we mu
INIT program on M2, a rep function from some subset of the states
states of M1, and implementations (by means of programs on M2)
functions of M1.

To remember the set of known list cells, we use multiple SEA
Each is a separate abstract machine whose state is independent of
the others. We adopt the notation of Simula and write "s.f(x)" t
call of function f of machine s on argument x. The set of
consists of a distinguished machine and a dynamically varying
machines. The distinguished machine (recorded in VARIABLE
correspondence between forms and the secondary machines
machines map from forms to forms. In particular, if c is ai
then a retrieval from the distinguished SEARCH, with key hi
SEARCH table s such that retrieval from s with key hcar(c) ;
two-level retrieval system is illustrated in Figure 3.
seventh list (a (b c) c) has the sixth list ((b c) c) as its I
with key 6 in the primary SEARCH table gives a secondary tabl
key a (the hcar of 7) we find 7.] Furthermore, only hcons'ed
retrievable from this two-level structure. We make this re
by defining the domain of rep as the set of lower level state

(∀c,xa,xb)(load().get(xb)=NONE
or (load().get(xb).get(xa)=c≠NONE
⊃ (consp(c) and car(c)=xa
and cdr(c)=xb)))

The cells c satisfying this formula are exactly the lower-lev
HASH-CONS. Although the other lower-level cells are not
existence does not affect the correctness of the implementat
known cells is:

SD = {c : load().get(cdr(c)).get(car(c))=c}.


Let T be a state of M2 in domain(rep), defined by its
get, and load. We specify the represented state rep(T) by i
hcdr, and hconsp:

hcar = car, restricted to the subdomain SD;

hcdr = cdr, restricted to the subdomain SD; and

hconsp(x) = consp(x)
and load().get(cdr(x))≠NONE
and load().get(cdr(x)).get(car(x))=x.

Thus hconsp is TRUE precisely for elements of SD.

For INIT we want a program that saves in VARIABLE a n
initialized SEARCH machine. Adapting Simula's notation, we w

store(new SEARCH).

Following the execution of INIT, there are no hcons'ed cells.

Finally, we implement the functions of M1. hconsp(x) is

consp(x) and load().get(cdr(x))≠NONE
and load().get(cdr(x)).get(car(x))=x;

hcar(x) is implemented by:

assert hconsp(x);
car(x);

hcdr(x) is implemented by:

assert hconsp(x);
cdr(x);

and hcons(xa,xb) is implemented by:

load().save(xb, new SEARCH).save(xa, cons(xa,xb)).

The expression e=load().save(xb, new SEARCH) yields the secondary SEARCH table
(either existing or newly created) in which the cell we seek should be
recorded. Thus e.save(xa, cons(xa,xb)) finds that cell if it exists, or,
failing that, records a suitable new cell in the table. Although this program
may appear to be profligate in its creation of list cells and SEARCH tables,
in fact the specification of save guarantees that previously stored entries
are not overwritten, and the executable Simula version of save, given below,
will be such that the second argument is not evaluated unless it is needed.
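
The two-level retrieval scheme and the delayed evaluation of save's second argument can be illustrated as follows. This Python sketch is an assumption-laden analogue, not the Simula program of the paper: dictionaries stand in for SEARCH tables, the second argument of save is passed as a zero-argument function so that it is evaluated only when needed, and the names HashCons, Cell, and cells_created are hypothetical. Python's None serves as a list terminator and is distinct from the sentinel NONE.

```python
NONE = object()   # distinguished default value of SEARCH


class Cell:
    """A list cell of the underlying CONS machine."""
    def __init__(self, car, cdr):
        self.car, self.cdr = car, cdr


class Search:
    def __init__(self):
        self._table = {}

    def get(self, key):
        return self._table.get(key, NONE)

    def save(self, key, make_value):
        # make_value is called only if nothing is saved under key.
        if self.get(key) is NONE:
            self._table[key] = make_value()
        return self._table[key]


class HashCons:
    def __init__(self):
        self._primary = Search()    # INIT: store(new SEARCH)
        self.cells_created = 0      # instrumentation for this sketch only

    def _cons(self, xa, xb):
        self.cells_created += 1
        return Cell(xa, xb)

    def hcons(self, xa, xb):
        # load().save(xb, new SEARCH).save(xa, cons(xa,xb))
        secondary = self._primary.save(xb, Search)
        return secondary.save(xa, lambda: self._cons(xa, xb))

    def hconsp(self, x):
        if not isinstance(x, Cell):
            return False
        secondary = self._primary.get(x.cdr)
        return secondary is not NONE and secondary.get(x.car) is x

    def hcar(self, x):
        assert self.hconsp(x)
        return x.car

    def hcdr(self, x):
        assert self.hconsp(x)
        return x.cdr


h = HashCons()
tail = h.hcons("b", None)
one = h.hcons("a", tail)
two = h.hcons("a", tail)
print(one is two, h.cells_created)
```

The repeated call creates neither a new cell nor a new secondary table, and a cell built outside HashCons with the same components is not recognized by hconsp.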

We now prove the correctness of this formal implementation. As we
explained in Section 2, the proof requires verifying four conditions C1, C2,
C3, and C4 as follows:

(C1) INIT is "store(new SEARCH)". Its execution from the initial state T
of M2 yields a state T' in which load() is a new SEARCH and hence such that,
for any x, load().get(x)=NONE. Clearly T' is in domain(rep) and rep(T') is a
state of M1 in which the domains of hcar, hcdr, and hconsp are all empty, that
is, the initial state of M1 as required.

(C2) The only state-changing function is hcons. Let impl(hcons,[xa,xb])
be executed from a state T in domain(rep), yielding a new state T'. If
'load'().'get'(xb)≠NONE and 'load'().'get'(xb).'get'(xa)≠NONE, then the
implementation causes no state change in M2. Moreover, from the definition of
rep, we deduce that a list cell c exists such that, in state S of M1,
'hconsp'(c), 'car'(c)=xa, and 'cdr'(c)=xb. Hence there is no state change in
M1 or M2 and the commutativity of the diagram shown in Figure 1 is immediate.
On the other hand, if either call of get does return NONE, then there is a
state change in M2. However, it is then clear from the specifications and rep
that no suitable HASH-CONS list cell existed in the old state S of M1
(i.e., a cell with hcar xa and hcdr xb) and, moreover, that such a cell does
exist in the new state S'. In this case it can be seen that the
implementation of hcons changes T exactly by creating a list cell c such that
load().get(xb).get(xa)=c. Hence rep(T') is S' and the diagram does commute.

(C3) It must be shown that the V-function implementations do not change
the represented state of M1. But this is trivial since the implementing
programs do not call any O-functions of M2 and hence cannot cause any state
change in M2.

(C4) Here it is necessary to show that the values returned by the
implementing programs on M2 agree with the specifications of M1. This is
immediate since the same list cells are employed at both levels to represent
HASH-CONS lists and since the values and domains of car, cdr, and consp are
updated exactly in parallel with those of their counterparts hcar, hcdr, and
hconsp.

This completes the proof of the HASH-CONS implementation.


Formal  Implementation  of  SEARCH 

Next we describe a formal implementation of SEARCH. The implementation
by Deutsch actually uses two different implementations: a conventional
hash table for the primary SEARCH table, and LISP assoc lists (i.e., lists of
cons cells) for the subordinate tables. In this paper we present only the
second of these for both purposes; the use of both is discussed in the last
part of this section.

The state of a virtual machine M3 for SEARCH is defined by its
interpretation of the primitive get. We wish to implement this machine by
means of a machine M4 with the facilities of CONS and VARIABLE. The central
point of this implementation will be to use LISP assoc lists to represent the
map from keys to values. Thus we define the predicate wff-assoc-list, which
tests whether an assoc list is well-formed, by:

wff-assoc-list(x) =
    x=NONE or (consp(x) and consp(car(x))
               and wff-assoc-list(cdr(x))).

The domain of rep is now the set of states of M4 such that:

wff-assoc-list(load())

That is, the represented states are those such that the value in VARIABLE is a
well-formed assoc list. For a state T in this domain, we define the
represented state of M3 to be the state such that, for any x1:

get(x1) = getx(x1,load())

where the definition of getx is:

getx(x1,x) =
    if x=NONE then NONE else
    if car(car(x))=x1 then cdr(car(x))
    else getx(x1,cdr(x)).


The specified initial state of M4 correctly models the initial state of
M3; hence INIT is the empty program.

Finally, we implement the two functions of M3. The OV-function save is
implemented by:

save(x1,x2) =
    if getx(x1,load())=NONE
        then store(cons(cons(x1,x2),load()));
    getx(x1,load());

and the V-function get is implemented by:

get(x1) = getx(x1,load()).

Thus we retrieve values by considering in order the elements of the linked
list in which keys and values are stored.
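
The formal implementation of SEARCH can be mirrored by a short executable
sketch. The following Python rendering is an illustration only, not part of
the original development: cons cells are modelled as 2-tuples, NONE as
Python's None, the VARIABLE as a private attribute, and keys are compared
with == rather than with the identity test of the paper. The class name
Search is hypothetical.

```python
from typing import Any, Optional, Tuple

# A cons cell is modelled as an immutable pair; NONE marks the empty list.
NONE = None
Cell = Optional[Tuple[Any, Any]]


def getx(x1: Any, x: Cell) -> Any:
    """Walk the assoc list x and return the value stored under key x1."""
    while x is not NONE:
        pair, rest = x            # car(x), cdr(x)
        if pair[0] == x1:         # car(car(x)) = x1 (== is an assumption)
            return pair[1]        # cdr(car(x))
        x = rest
    return NONE


class Search:
    """Hypothetical stand-in for machine M3, built on a single variable."""

    def __init__(self) -> None:
        self._var: Cell = NONE    # the VARIABLE; initially the empty list

    def save(self, x1: Any, x2: Any) -> Any:
        # Store (x1 . x2) only when x1 has no value yet, then return
        # whatever is now associated with x1.
        if getx(x1, self._var) is NONE:
            self._var = ((x1, x2), self._var)
        return getx(x1, self._var)

    def get(self, x1: Any) -> Any:
        return getx(x1, self._var)
```

Note that a second save under an existing key leaves the table unchanged and
returns the value already stored, which is the behaviour the HASH-CONS
implementation relies upon.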


Extensions of Simula


We want to describe the implementations of CONS and VARIABLE in terms of
an  underlying  Simula  machine.  For  the  implementations  described  above,  we 
have  been  able  to  refer  to  formal  specifications  for  the  abstract  machines 
involved;  here  we  must  instead  rely  on  informal  descriptions  of  Simula,  since 
a formal  specification  of  the  language  is  outside  the  scope  of  this  paper. 
(The  reader  should  consult  [22]  for  an  example  of  axiomatic  specification  of  a 
programming  language.)  It  is  important  to  note  that  the  correctness  of  the 
complete  implementation  depends  upon  the  correctness  of  the  underlying  Simula 
system. 

Simula is essentially an extension of Algol 60. The primary addition is
the class construction. The definition of a class C is syntactically similar
to an Algol block and has its own local variables and local procedures. It is
activated by the call "new C" whose value is a new instance of the class; this
value, x, satisfies the assertion "x is C". The significance of the class
construct is that, unlike Algol, where the completion of a block or procedure
activation results in the discarding of the local environment of the
activation, this environment survives the initial activation in Simula and
thus remains accessible by means of the "dot" notation used above. (Thus
class instances may be used to model the abstract machines discussed above.)
The interested reader should consult [1,2] for descriptions of standard
Simula; we concentrate here on a number of modifications that we have adopted.

We begin with the assignment and equality operators. Although Simula
distinguishes pointer (or reference) values from nonpointer values and uses
different assignment operators for the two types, we omit this distinction and
employ the single operator ':=' for both. The basic equality test, denoted by
the infix =, should be taken to mean that its arguments are identical. For
primitive objects (e.g., integers or Booleans) this is the conventional
notion. For two compound objects (e.g., arrays), this implies that both
arguments of the test refer to the same object; identical contents do not
suffice. And for pointers (which in Simula include all class instances), the
same object must be pointed to for the test to succeed. Thus for the purpose
of proof, = is an identity relation: it is an equivalence relation such that
equal expressions may be substituted for each other. (Note that this is not
the case for the derived predicate "equal" on forms, though it will turn out
to hold for the corresponding predicate on forms of HASH-CONS.)
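
The difference between the identity test and the derived predicate "equal"
may be illustrated outside Simula. The Python sketch below is an assumption
made for illustration: Python's "is" plays the role of the infix =, and the
function equal is a hypothetical structural comparison of cons cells.

```python
class Cons:
    """A plain cons cell; instances compare by identity by default."""

    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr


def equal(a, b):
    """Structural comparison, analogous to the derived predicate "equal"."""
    if isinstance(a, Cons) and isinstance(b, Cons):
        return equal(a.car, b.car) and equal(a.cdr, b.cdr)
    return a == b                 # atoms, or a cell against a non-cell
```

Two separately constructed lists with the same contents satisfy equal but are
not identical, so one may not be substituted for the other where identity
matters.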
The next set of changes to Simula involve declarations and procedure
calls. We augment declarations by allowing an initial value to be specified
with the key word initial; if this option is not used, then the initial value
is the default NONE. Actual parameters to procedures are called by first use
as suggested by [11] (where the rule is denoted normal evaluation); the actual
parameter expression is not evaluated until needed in the procedure body and
the value obtained at this first use is used at subsequent references without
reevaluating the expression. (This differs from call-by-value where the
expression is always evaluated exactly once and from call-by-name where it is
evaluated as often as it is referred to; with call-by-first-use the expression
is evaluated at most once.) We allow procedure definitions to be augmented by
assert statements that provide Boolean preconditions for a call of the
procedure. In an implementation these preconditions might be formally proved
for all calls prior to compilation, tested dynamically at run time, yielding
exceptions when they failed, or ignored (at the user's peril). Finally, for
class activations, we sometimes write "c(...)" instead of "new c(...)" for
conciseness.
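
Call-by-first-use can be approximated in a language with call-by-value by
passing a memoizing thunk. The sketch below is one possible rendering under
that assumption; the names FirstUse, expensive, use_twice and use_never are
hypothetical and serve only to show that the argument expression is evaluated
at most once.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class FirstUse(Generic[T]):
    """Delays an argument expression until first needed, then caches it."""

    def __init__(self, expr: Callable[[], T]) -> None:
        self._expr = expr
        self._done = False
        self._value = None

    def value(self) -> T:
        if not self._done:            # evaluate at most once
            self._value = self._expr()
            self._done = True
        return self._value


calls: List[str] = []


def expensive() -> int:
    calls.append("evaluated")         # record each evaluation
    return 42


def use_twice(arg: FirstUse) -> int:
    return arg.value() + arg.value()  # two references, one evaluation


def use_never(arg: FirstUse) -> int:
    return 0                          # the argument is never evaluated
```

Under call-by-value, expensive would run once in both procedures; under
call-by-name it would run twice in use_twice.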

Our last set of changes relates to the protection facilities of Simula.
The value of Simula for our application derives from the possibility of using
a class instance to represent an abstract machine with accesses to the
environment of the instance corresponding to interrogation or modification of
the abstract machine state. Unfortunately, in standard Simula, access to
these environments is unrestricted and thus it is difficult to capture the
semantics of abstract machines such as those of Section 3. Several authors
have suggested protection features for Simula to remedy this problem [3,13];
we adopt an amalgam of these suggestions.

The object is to control access to the locals of class instances. We
achieve this by forbidding access unless it is explicitly authorized by the
class definition. If a local procedure p of a class is declared public then
it may be invoked, using the "dot" notation x.p(arg1, ..., argn), on any
instance  x that  is  available  at  the  point  of  invocation.  Other  local 
procedures  of  the  class  are  considered  hidden  and  are  thus  only  callable  from 
within  the  class  body.  Similar  restrictions  apply  to  local  data  of  a class: 
writable  locals  may  be  arbitrarily  accessed;  readable  locals  may  be  read  but 
not  changed;  and  other  locals  may  not  be  accessed  outside  the  class  body. 

These protection facilities will suffice for the present paper; it is
interesting to note, however, a more flexible facility that is needed in
general. Suppose a hierarchy of classes is constructed so that class c1
creates local instances of c2 and occasionally returns these instances as the
result of public procedure calls. To implement a desired data abstraction,
one might want to allow the public procedures of the local c2 instances to be
invocable within c1 but not by the creator of c1. (This restriction would be
necessary in our examples if existing CONS lists were modifiable by
destructive operations such as the LISP rplaca and rplacd [10]; it would then
be important to guarantee that a user of HASH-CONS could not modify lists
represented by CONS lists, even though such modification was possible by the
original creator of a CONS list.) Alphard provides a suitable mechanism for
implementing this protection [20]; the reader should consult [12] for an
extensive analysis of related issues.
Implementations of CONS and VARIABLE

Using Simula, thus modified, we can implement CONS and render the
implementations of SEARCH and HASH-CONS given above by executable programs.
First, CONS is implemented by declaring:

class cons(car, cdr); readable any car, cdr;
comment The body of cons is null;

(Note the underscoring of reserved words.) The parameters of the definition
allow initial values to be supplied when a new instance is created. Instances
of the class cons then represent the individual cons cells of the
specification; the OV-function call "cons(xa, xb)" is implemented by the
Simula statement "new cons(xa, xb)"; for a cons instance c, the V-function
calls "car(c)" and "cdr(c)" are implemented, respectively, as "c.car" and
"c.cdr"; and the V-function call "consp(x)" is implemented by the Simula
expression "x is cons". Note that we use the set of existing instances of the
class cons to realize the CONS module specification. By contrast, the
programs given below will realize HASH-CONS and SEARCH abstract machines by
single instances of corresponding classes. (Since, in practice, there will be
only a single underlying Simula machine, the implementation of consp just
given is not strictly correct. The problem is that the Simula state contains
three kinds of cons instances: those created explicitly by the user; those
created by the implementation of HASH-CONS; and those created by the
implementation of SEARCH. The implementation "x is cons" for "consp(x)"
confuses these three. This technical difficulty may be resolved by defining
identical but differently named classes (cons1, cons2, and cons3) for the three
uses. However, this would complicate the presentation unnecessarily and thus
we retain the single cons.)

The  VARIABLE  module  does  not  require  a special  Simula  implementation — its 
facilities  are  primitive  in  the  language,  provided  by  declared  variables  of 
suitable  scope. 

SEARCH and HASH-CONS Revisited


The SEARCH implementation given above may be straightforwardly translated
to the Simula class declaration:
class assocer;
begin
    any x;

    any public procedure save(x1, x2); any x1, x2;
    begin
        save := save2(x1, x);
        if save=NONE then
        begin x := cons(cons(x1, x2), x); save := x2 end
    end;

    any public procedure get(x1); any x1;
        get := save2(x1, x);

    any procedure save2(x1, x); any x1, x;
        save2 := if x=NONE then NONE else
                 if x.car.car=x1 then x.car.cdr
                 else save2(x1, x.cdr)
end;

whose  local  variable  x realizes  the  VARIABLE  of  the  formal  implementation  and 
whose public procedures are the required O- and V-functions of the module.

An important advantage of the design method presented here is that the
correctness of the implementation of HASH-CONS is independent of the
particular implementations of lower-level modules, so long as those modules
satisfy their specifications. This fact can be applied in our example by
using two different implementations of SEARCH: a conventional hashtable with
open searching for the distinguished SEARCH table, and the assoc list version
just given for the others. For a correct hashtable implementation, the reader
is referred to [19]; here we omit the details and give only the structure of
the implementation (which does satisfy the SEARCH specification). This is as
follows:

class hasher;
begin
    assocer public procedure save(x1, x2);
        any x1; assocer x2;

    assocer public procedure get(x1); any x1;

end;


Finally, HASH-CONS is realized by another straightforward translation to
Simula of a formal implementation, building on the previously defined classes
hasher, assocer, and cons. The required definition is:
class hash-conser;
begin
    hasher a initial hasher();

    any public procedure hcar(x); cons x;
        hcar := x.car;

    any public procedure hcdr(x); cons x;
        hcdr := x.cdr;

    any public procedure hcons(x1, x2); any x1, x2;
        hcons := a.save(x2, assocer()).save(x1, cons(x1, x2));

    Boolean public procedure hconsp(x); any x;
    begin
        any temp;
        hconsp := x is cons
            and (temp := a.get(x.cdr))≠NONE
            and x=temp.get(x.car)
    end
end;


Note that call-by-first-use assures that the second arguments to save are
evaluated only if they are needed.
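
The two-level table structure of hash-conser can be tried out in a
conventional language. The Python sketch below is an approximation under
stated assumptions, not the Simula program itself: a Python dict stands in for
both the hashtable and the assoc-list versions of SEARCH, a zero-argument
callable stands in for a call-by-first-use parameter, and the class names
Table and HashConser are hypothetical.

```python
class Cons:
    """Cons cell; default equality and hashing are by object identity."""

    def __init__(self, car, cdr):
        self.car, self.cdr = car, cdr


class Table:
    """SEARCH module: save() keeps the first value stored under a key."""

    def __init__(self):
        self._d = {}

    def save(self, key, make_value):
        # make_value is a thunk standing in for call-by-first-use:
        # it runs only when the key is absent.
        if key not in self._d:
            self._d[key] = make_value()
        return self._d[key]

    def get(self, key):
        return self._d.get(key)       # None plays the role of NONE


class HashConser:
    def __init__(self):
        self._a = Table()             # primary table, keyed by the cdr

    def hcons(self, x1, x2):
        sub = self._a.save(x2, Table)              # subordinate table for x2
        return sub.save(x1, lambda: Cons(x1, x2))  # unique cell for (x1, x2)

    def hconsp(self, x):
        if not isinstance(x, Cons):
            return False
        sub = self._a.get(x.cdr)
        return sub is not None and sub.get(x.car) is x

    @staticmethod
    def hcar(x):
        return x.car

    @staticmethod
    def hcdr(x):
        return x.cdr
```

Because the thunks run only when a key is missing, a repeated hcons with the
same arguments allocates neither a new subordinate table nor a new cell.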


5.  SOME  RESULTS  ABOUT  HASH-CONS  AND  CONS 


Consider the interface available to a top-level user, as shown in Figure
2. A HASH-CONS machine provides unique list-processing and a CONS machine
provides conventional list-processing. We wish to prove some properties of
the lists that are obtained by the user of these machines. The first of these
may be viewed as part of the assertion that HASH-CONS does indeed provide
unique lists. That is, if two unique lists have the same hcar and hcdr, then
they are identical. More precisely, we have:

Theorem 1. (∀c1)(∀c2) hconsp(c1) and hconsp(c2)
                      and hcar(c1)=hcar(c2)
                      and hcdr(c1)=hcdr(c2)

                      -> c1=c2.


Proof. The theorem will be proved by induction on sequences of states of
HASH-CONS. This method of proving properties of modules or protected classes,
which we call generator induction, is discussed in [5,19,21]. We must prove
the theorem for each initial state of HASH-CONS and for each state S' such
that the theorem holds in some immediately preceding state S. So that it is
clear, in each of the formulas that follow, which state is involved, we will
concatenate state names and function names. For example, hcarS(c) denotes the
hcar of c in state S.

The basis of the induction is immediate, since hconspI(c) is FALSE for
any c and initial state I. Thus it suffices to assume the theorem holds in
some state S and to deduce its validity in a successor state S' that results
from the operation hcons(x1,x2) (since hcons is the machine's only
state-changing operation). Suppose that c1 and c2 are lists such that
hconspS'(c1), hconspS'(c2), hcarS'(c1)=hcarS'(c2), and hcdrS'(c1)=hcdrS'(c2).
We must show that c1=c2. This is an immediate consequence of the inductive
assumption if hconspS(c1) and hconspS(c2) both hold, so we may assume that
either ~hconspS(c1) or ~hconspS(c2). Without loss of generality, assume that
~hconspS(c1). Then, according to the specification of hcons(x1, x2) and the
fact that hconspS'(c1), it must be the case that c1=hconsS(x1, x2). We now
consider two cases: ~hconspS(c2) or hconspS(c2).

In the first case, ~hconspS(c2), it follows as above from the definition of
hconsS(x1, x2) and the fact that hconspS'(c2), that c2=hconsS(x1, x2), and
hence that c1=c2 as required.

In the second case, hconspS(c2) together with the facts that
hcarS'(c2)=hcarS'(c1)=x1 and hcdrS'(c2)=hcdrS'(c1)=x2, imply by the
specification of hcons that S'=S. Consequently, the result that c1=c2 is
immediate by the inductive assumption and the proof is complete.

Next, we extend this result to show that equal is not needed for HASH-CONS
lists because its function is served by "=". In the proof, we will need
a function hequal for HASH-CONS, which we have not yet defined; it is defined
by analogy with CONS equal in the obvious way.

Theorem 2. (∀x1, x2)[hequal(x1, x2) -> x1=x2]


Proof. Note that only a single state occurs in this theorem and proof,
so the special notation of Theorem 1 is not needed. If ~hconsp(x1), then the
result is a straightforward consequence of the definition of hequal. To prove
the result when hconsp(x1), note that hequal(x1, x2) implies that hconsp(x2).
We proceed by structural induction [21], assuming the result true for hcar(x1)
and for hcdr(x1) (and all x2). Then hequal(x1, x2) implies that
hequal(hcar(x1), hcar(x2)) and hequal(hcdr(x1), hcdr(x2)); hence by the
inductive assumption, hcar(x1)=hcar(x2) and hcdr(x1)=hcdr(x2). Theorem 1 now
applies and yields the desired result.

Next we prove a theorem about a program that might be run by a top-level
user of these machines to translate conventional lists to unique lists. We
claim that this may be done by the function hcopy defined as follows:

hcopy(x) := if consp(x)
               then hcons(hcopy(car(x)), hcopy(cdr(x)))
               else x;

The major result about hcopy is that if "equal" holds for any two forms, then
"=" holds for their hcopies. To state this precisely, we define
correct-hcopy(xa,xb) to be:

equal(xa,xb) -> hcopy(xa)=hcopy(xb).

We can then prove:

Theorem 3. (∀xa,xb)[correct-hcopy(xa,xb)].
This formula may appear ill-formed, since the effects and result of hcopy
are contingent upon, and also modify, the state of HASH-CONS. A more precise
statement is as follows. Suppose equal(xa,xb). Suppose that hcopy is applied
to xa, beginning in some HASH-CONS state S1. This application terminates
(since our lists are acyclic) yielding a value xa' and a state S2. Suppose,
moreover, that S3 is any successor of S2, resulting from a series of state
transitions of HASH-CONS, starting at S2. Finally, suppose that hcopy,
applied to xb from state S3, yields value xb' and state S4. Then xa'=xb'.
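
The scenario of states S1 through S4 can be exercised directly. The following
Python sketch is illustrative only and rests on several assumptions:
conventional CONS cells are modelled as 2-tuples, the HASH-CONS state is a
single module-level dict keyed by (car, cdr) rather than the two-level tables
of the paper, and the class name HCell is hypothetical.

```python
class HCell:
    """A unique (hash-consed) cell; compared and hashed by identity."""

    def __init__(self, car, cdr):
        self.car, self.cdr = car, cdr


_table = {}   # HASH-CONS state: (car, cdr) -> the unique cell


def hcons(x1, x2):
    key = (x1, x2)
    if key not in _table:             # the state changes only on a miss
        _table[key] = HCell(x1, x2)
    return _table[key]


def consp(x):
    # Conventional CONS cells are modelled as 2-tuples in this sketch.
    return isinstance(x, tuple) and len(x) == 2


def hcopy(x):
    if consp(x):
        return hcons(hcopy(x[0]), hcopy(x[1]))
    return x


def make_list(*items):
    """Build a fresh conventional list (nested 2-tuples ending in None)."""
    result = None
    for item in reversed(items):
        result = (item, result)
    return result
```

Copying one list, performing an unrelated hcons, and then copying an equal
list yields the very same cell, since cells already present in the table
remain valid in every successor state.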

We  are  going  to  prove  the  result  about  hcopy  by  induction,  which  would  be 
very  cumbersome  if  an  inductive  assumption  of  this  rather  lengthy  form  were 
required.  Fortunately,  HASH-CONS  satisfies  the  powerful  frame  axiom  that,  if 
hconsp(x)  holds  in  some  state  S,  then  it  holds  in  every  successor  state.  This 
axiom  allows  us,  without  risk  of  unsoundness,  to  omit  further  references  to 
changing states and use the simple statement of the theorem.

Proof. First, suppose ~consp(xa). Then equal(xa, xb) implies that
xa=xb. Also, hcopy(xa)=xa and hcopy(xb)=xb so that the desired result is
immediate. Next suppose consp(xa). Then consp(xb). We proceed by
simultaneous structural induction on xa and xb. That is, we assume

(I1) (∀x)correct-hcopy(s(xa),x)

(I2) (∀x)correct-hcopy(x,s(xb))

where s is either car or cdr. (Clearly, correct-hcopy is symmetric in its two
arguments.) From equal(xa, xb) it follows that

(I3) equal(car(xa), car(xb))

(I4) equal(cdr(xa), cdr(xb))

Combining these results with I1 and I2 we obtain

(I5) hcopy(car(xa))=hcopy(car(xb))

(I6) hcopy(cdr(xa))=hcopy(cdr(xb))

We must prove that hcopy(xa)=hcopy(xb). This is done as follows:

hcopy(xa) = hcons(hcopy(car(xa)), hcopy(cdr(xa)))
                {by the definition of hcopy}

          = hcons(hcopy(car(xb)), hcopy(cdr(xb)))
                {by I5, I6}

          = hcopy(xb)
                {by the definition of hcopy}


6.  CONCLUSION 


The design and implementation of complex programs is presently expensive
and time consuming. Moreover, the programs that result are often obscure and
unreliable. Yet the attempt to increase reliability by proving program
correctness can substantially add to initial development costs if conventional
design and implementation techniques are followed. In this paper, we have
presented a formal discipline for constructing large programs and we have
discussed a number of programming language features that make it easy to
derive an executable program from a formal implementation. We believe that
these techniques have great potential for making proof a useful tool. Careful
hierarchical structuring in program design leads to cleanly structured
programs and formal specification of program modules provides a precise guide
to implementors (as well as to managers). Also, the partitioning that is
inherent in the programs that result, as illustrated in our example, leads to
corresponding partitioning in proofs about these programs. Thus these
techniques can enable the use of program proving as an effective technique in
developing large and complex systems.
VERIFICATION OF DATA TYPES USING ALGEBRAIC AXIOMS

by 

John  V.  Guttag*,  Ellis  Horowitz*,  and  David  R.  Musser** 


Abstract:  It  is  shown  how  a data  abstraction  is  naturally  expressed  using  algebraic  axioms. 
The  virtue  of  these  axioms  is  that  they  permit  the  formal  specification  of  a data  type  in  a 
representation-independent  manner.  A moderately  complex  example  is  given  which  shows 
how  to  employ  algebraic  axioms  at  successive  levels  of  implementation.  The  major  thrust 
of the paper is to show how the use of algebraic axiomatizations can significantly improve
the  process  of  proving  the  correctness  of  an  implementation  of  an  abstract  data  type. 
1. Introduction

The  key  problem  in  the  design  and  validation  of  large  software  systems  is  reducing 
the  amount  of  complexity  or  detail  that  must  be  considered  at  any  one  time.  Two  common 
and  effective  approaches  to  accomplishing  this  are  decomposition  and  abstraction. 

One decomposes a task by factoring it into two or more separable sub-tasks.
Unfortunately, for many problems the separable sub-tasks are still too complex to be
mastered  in  toto.  The  complexity  of  this  sort  of  problem  can  be  reduced  via  abstraction. 
By  providing  a mechanism  for  separating  those  attributes  of  an  object  or  event  that  are 
relevant  in  a given  context  from  those  that  are  not,  abstraction  serves  to  reduce  the 
amount  of  detail  that  one  need  come  to  grips  with  at  any  one  time. 

If  one  is  to  make  full  use  of  abstraction,  the  availability  of  a good  notation  for 
expressing  abstractions  is  critical.  It  is  obvious  that  a reasonable  language  is  a 
prerequisite  to  the  communication  of  something  as  intangible  as  an  abstraction.  It  is  less 
obvious,  but  equally  true,  that  a reasonable  language  is  a prerequisite  to  the  creation  of 
such  abstractions.  Even  if  the  availability  of  a language  is  not  necessary  for  the  initial 
formulation  of  an  abstraction  (an  argument  we  leave  to  psychologists  and  linguists),  it  is 
certainly  necessary  if  the  abstraction  is  to  be  retained  and  developed  over  any  significant 
period  of  time. 
A recent trend in programming is the development of abstract data types or data
abstractions.  In  a data  abstraction,  a number  of  functional  abstractions  are  grouped 
together.  The  clustered  operations  are  related  by  the  fact  that  they,  and  only  they, 
operate  on  a particular  class  or  type  of  object.  Some  typical  data  abstractions  are:  a 
symbol  table,  a priority  queue,  and  a set. 

In  this  paper  we  shall  present  a notation  for  describing  data  abstractions  which  we 
call  algebraic  axioms.  Our  experience  indicates  that  the  resulting  specification  of  a data 
abstraction  using  algebraic  axioms  is  both  rigorous  and  easy  to  comprehend;  see 
[Guttag76b].  However,  the  point  we  wish  to  stress  in  this  paper  is  not  design  but  the  use 
of  algebraic  axioms  for  proofs  of  correctness.  In  Section  4 we  show  how  this  axiomatic 
technique  can  be  profitably  employed  to  prove  the  correctness  of  an  implementation  of  a 
data  abstraction.  The  strength  of  the  technique  is  that  it  factors  the  proving  process  into 
distinct,  manageable  stages,  and  further,  it  simplifies  the  proof  at  each  stage.  In 
[Guttag76c]  we  discuss  an  automated  system  which  processes  an  algebraic  axiomatization 
of  a data  abstraction  in  such  a way  that  correctness  proofs  can  be  carried  out 
semi-automatically,  and,  in  addition,  programs  may  be  tested  before  an  implementation  into 
a conventional  programming  language  is  achieved.  This  coupling  of  testing  and 
correctness  is  a valuable  by-product  of  the  algebraic  axiom  approach  and  is  a strong 
argument  for  its  worth. 


2.  Definitions  and  examples 

Rather  than  presenting  formal  definitions  of  abstract  data  types  and  related 
concepts,  we  prefer  to  give  informal  and  (hopefully)  intuitively  appealing  definitions  and  to 
illustrate  the  main  ideas  with  a number  of  examples.  We  shall  view  a data  type  T as  a 
class of values and a collection of operations on the values. If the properties of the
operations  are  specified  only  by  axioms,  we  call  T an  abstract  data  type  or  a data 
abstraction.  An  implementation  of  a data  abstraction  is  an  assignment  of  meaning  to  the 
values  and  operations  in  terms  of  the  values  and  operations  of  another  data  type  or  set  of 
data  types.  A correct  implementation  of  a data  abstraction  is  an  implementation  which  is 
expressed  in  terms  of  correct  data  types  and  which  satisfies  the  axioms. 

An  algebraic  axiom  specification  of  a data  type  T consists  of  a syntactic  and  a 
semantic  description.  The  syntactic  specification  defines  the  names,  domains  and 
ranges  of  the  operations  of  T.  The  semantic  specification  contains  a set  of  axioms  in  the 
form  of  equations  which  relate  the  operations  of  T to  each  other.  The  term  "algebraic"  is 
appropriate  because  the  values  and  axioms  can  be  regarded  as  an  abstract  algebra. 
[Goguen75]  and  [Zilles75]  have  strongly  emphasized  the  algebraic  approach,  developing 
the  theory  of  abstract  data  types  as  an  application  of  heterogeneous  algebras. 
Implementations  are  treated  under  this  approach  as  other  algebras,  and  the  problem  of 
showing  an  implementation  is  correct  is  treated  as  one  of  showing  the  existence  of  a 
homomorphic  mapping  from  one  algebra  to  another.  We  shall  in  this  paper  de-emphasize 
the  explicit  use  of  algebraic  terminology,  preferring  instead  the  terminology  of 
programming.  In  spite  of  this  difference  in  terminology,  there  are  many  similarities 
between our approach and the more purely algebraic approach. A significant technical
difference  which  does  exist  between  the  two  approaches  is  discussed  in  [Guttag76c]. 

The choice of a language in which to express the specifications is of importance.
We must be able to express the relationships among the operations both precisely and
clearly. In addition, the specification language itself must be axiomatically defined to
facilitate  correctness  proofs.  We  begin  by  assuming  a base  language  with  five  primitives: 
functional  composition,  an  equality  relation  (=),  two  distinct  constants  (TRUE  and  FALSE), 
and  an  unbounded  supply  of  free  variables.  From  these  primitives  one  can  construct  an 
arbitrarily  complex  specification  language;  for  once  an  operation  has  been  defined  in  terms 
of  the  primitives,  it  may  be  added  to  the  specification  language.  An  IF-THEN-ELSE 
operation,  for  example,  may  be  defined  by  the  axioms: 

IF-THEN-ELSE(TRUE,q,r)  = q, 

IF-THEN-ELSE(FALSE,q,r)  = r. 

We  shall  assume  that  the  expression  IF-THEN-ELSE(b,q,r),  which  we  shall  write  as  IF  b 
THEN  q ELSE  r,  is  part  of  the  specification  language.  We  shall  also  assume  the  availability 
of infix Boolean operators such as ∧, ∨, ⊃, =, etc. Finally, we allow for the conventional
operations  on  integers:  PLUS,  MINUS,  TIMES,  DIV,  MOD,  and  use  the  conventional  infix 
operators  when  convenient. 


2.1  Stack  Example 

One  of  the  simplest  examples  of  an  abstract  data  type  is  the  unbounded  Stack  type. 


Figure 2.1 Stack Data Type

type Stack[elementtype: Type]
syntax
    NEWSTACK -> Stack,
    PUSH(Stack,elementtype) -> Stack,
    POP(Stack) -> Stack,
    TOP(Stack) -> elementtype ∪ {UNDEFINED},
    ISNEW(Stack) -> Boolean,
    REPLACE(Stack,elementtype) -> Stack.

semantics
    declare stk:Stack, elm:elementtype;

    POP(NEWSTACK) = NEWSTACK,
    POP(PUSH(stk,elm)) = stk,
    TOP(NEWSTACK) = UNDEFINED,
    TOP(PUSH(stk,elm)) = elm,
    ISNEW(NEWSTACK) = TRUE,
    ISNEW(PUSH(stk,elm)) = FALSE,
    REPLACE(stk,elm) = PUSH(POP(stk),elm).

In the example of Figure 2.1 we have defined a data type Stack with six operations
via a syntactic specification of these operations, and a semantic specification which is a set
of seven equations relating the operations. Certain notational conventions exhibited by
this example will be used throughout. Operation names are written using all capital
letters. The name of a data type begins with a capital. In the equations the lowercase
symbols are free variables ranging over the domains indicated, e.g., stk ranges over the
Stack type. The symbol elementtype is a variable ranging over the set of types and elm
ranges over elementtype. This says that we can have a Stack of any type of elements
(but all must be of the same type); what we have defined is thus not a single type, but
rather a type schema. The binding of elementtype to a particular type,
e.g., Stack[Integer], reduces the schema to a specification of a single type. Using the
syntactic specification of the operations, it can be checked that each of the expressions in
the axiomatic equations is well-formed in the sense that each operator is applied to the
correct number of arguments and each argument has the correct type.

The  equations  are  statements  of  fact  (axioms)  relating  the  values  which  are  created 
by  the  operations,  e.g.,  the  equation 

TOP(PUSH(stk,elm))  = elm 

means that for any Stack value stk and any elementtype value elm, the result of
PUSH(stk,elm) is a Stack value, stk1, such that TOP(stk1) yields the value elm. In viewing
the  equations  in  this  way,  we  are  not  required  to  give  any  particular  interpretation  to  the 
values;  the  "useful"  properties  of  the  values  can  be  derived  solely  from  the  relations 
determined  by  the  axioms.  Thus,  in  designing  computer  implementations  of  the  operations, 
we  are  free  to  represent  the  values  in  many  different  ways. 

An  implementation  of  the  Stack  data  type  which  is  commonly  used  is  an 
implementation  in  terms  of  an  (Array,  Integer)  pair.  Each  Stack  value  is  represented  by  a 
structure  with  two  components  - an  array,  whose  components  are  of  type  elementtype, 
and  an  integer  indicating  the  position  in  the  array  of  the  top  element  of  the  stack.  The 
specifications  for  an  Array  data  type  are  given  in  Figure  2.2. 


Figure 2.2  Array Data Type

type Array[domaintype: Type, rangetype: Type]
syntax
    NEWARRAY -> Array,
    ASSIGN(Array,domaintype,rangetype) -> Array,
    ACCESS(Array,domaintype) -> rangetype.
semantics
    declare arr: Array, dval,dval1: domaintype, rval: rangetype;
    ACCESS(NEWARRAY,dval) = UNDEFINED,
    ACCESS(ASSIGN(arr,dval,rval),dval1) =
        IF dval = dval1 THEN rval ELSE ACCESS(arr,dval1).

ASSIGN(arr,t,elm) means the array identical to arr except possibly in the t-th
position, where the value is elm [McCarthy63]. ACCESS(arr,t) returns the value in position
t of the array arr.

The implementation of the Stack data type with (Array, Integer) pairs is given in
Figure  2.3.  We  have  divided  the  implementation  into  a representation  part  and  a programs 
part.  This  corresponds  to  the  division  of  the  specification  into  a syntactic  part  and  a 
semantic  part. 


Figure 2.3  An Implementation of the Stack Data Type with (Array, Integer) pairs

representation
    STAK(Array[Integer,elementtype],Integer) -> Stack[elementtype],
programs
    declare arr: Array, t: Integer, elm: elementtype;
    NEWSTACK = STAK(NEWARRAY,0),
    PUSH(STAK(arr,t),elm) = STAK(ASSIGN(arr,t+1,elm),t+1),
    POP(STAK(arr,t)) = IF t=0 THEN STAK(arr,0) ELSE STAK(arr,t-1),
    TOP(STAK(arr,t)) = ACCESS(arr,t),
    ISNEW(STAK(arr,t)) = (t=0),
    REPLACE(STAK(arr,t),elm) =
        IF t=0 THEN STAK(ASSIGN(arr,1,elm),1)
               ELSE STAK(ASSIGN(arr,t,elm),t).
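The programs of Figure 2.3 can be tried out directly. The following is a minimal Python sketch, not part of the original paper, under these assumptions: an Array is modelled as a dict that is copied on every ASSIGN (so the operations stay free of side effects), None stands in for UNDEFINED, and same_stack is a hypothetical helper that compares only the array positions 1..t that the Stack operations can observe.

```python
from typing import Any, NamedTuple

UNDEFINED = None  # assumption: None plays the role of UNDEFINED


def assign(arr: dict, i: int, v: Any) -> dict:
    """ASSIGN: return a copy of arr with position i set to v (no side effects)."""
    new = dict(arr)
    new[i] = v
    return new


def access(arr: dict, i: int) -> Any:
    """ACCESS: value at position i, or UNDEFINED if never assigned."""
    return arr.get(i, UNDEFINED)


class Stak(NamedTuple):
    """STAK(arr, t): the (Array, Integer) representation of a Stack."""
    arr: dict
    t: int


NEWSTACK = Stak({}, 0)


def push(s: Stak, elm: Any) -> Stak:
    return Stak(assign(s.arr, s.t + 1, elm), s.t + 1)


def pop(s: Stak) -> Stak:
    return Stak(s.arr, 0) if s.t == 0 else Stak(s.arr, s.t - 1)


def top(s: Stak) -> Any:
    return access(s.arr, s.t)


def isnew(s: Stak) -> bool:
    return s.t == 0


def replace(s: Stak, elm: Any) -> Stak:
    if s.t == 0:
        return Stak(assign(s.arr, 1, elm), 1)
    return Stak(assign(s.arr, s.t, elm), s.t)


def same_stack(a: Stak, b: Stak) -> bool:
    """Hypothetical equality on representations: compare positions 1..t only."""
    return a.t == b.t and all(
        access(a.arr, i) == access(b.arr, i) for i in range(1, a.t + 1)
    )
```

Note that POP(PUSH(stk,elm)) leaves the pushed element behind in the array above position t, so the two sides of that axiom are equal only under an equality such as same_stack, not as raw pairs. This is one of the complications that make the remaining axioms less straightforward to check than the TOP axiom.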

In  this  paper  the  language  used  to  express  programs  is  the  same  as  the  language 
for  expressing  axioms.  Though  we  recognize  that  a richer  language  is  usually  more 
desirable,  we  have  chosen  to  restrict  ourselves  here  for  several  reasons.  Most 
importantly, the proof procedure described in Section 4 derives much of its simplicity
from the use of this restricted set of constructs. Since all conventional programming
language features can be automatically translated into our basic set, see [Manna74], no
real advantage is lost. We are able to avoid issues of language design and concentrate on
how the basic command set can be axiomatized, used for correctness proofs, and used to
synthesize implementations. The clarity of the basic language for axiomatizing data types
has been shown in [Guttag76b] and [Horowitz76].

We  intend  that  all  of  the  operations  be  purely  "functional"  or  "applicative,"  i.e.,  have 
no  side  effects.  This  can  imply  an  unrealistic  degree  of  inefficiency  for  implementations. 
In  the  Stack  implementation  of  Figure  2.3,  for  example,  the  implementation  of  PUSH  must 
involve  copying  the  Array  component  as  well  as  the  Integer  component  of  the  Stack 
representation.  The  basic  framework  can  be  extended  to  permit  specification  of 
operations  with  side  effects,  so  that  the  obvious  efficient  implementations  are  possible. 
However,  since  the  exposition  of  our  proof  techniques  is  facilitated  by  this  restriction,  we 
will  continue  to  assume  no  side  effects  in  this  paper,  and  refer  the  reader  to  [Guttag76c] 
for  a discussion  of  the  extensions  required  to  remove  this  restriction. 

The correctness of an implementation of a data type can be proved by showing that
each axiom of the semantic specification is satisfied by the programs. As a particularly
simple example of such a proof, consider the fourth Stack axiom. Assuming
stk = STAK(arr,t),

    TOP(PUSH(stk,elm)) = TOP(PUSH(STAK(arr,t),elm))
                       = TOP(STAK(ASSIGN(arr,t+1,elm),t+1))
                       = ACCESS(ASSIGN(arr,t+1,elm),t+1)
                       = elm.

The  other  Stack  axioms  can  be  shown  to  be  satisfied  in  a similar  manner,  although  not 
quite  so  straightforwardly.  The  complications  that  arise  will  be  dealt  with  in  Section  4, 
which  discusses  verification  of  implementations  in  detail. 


2.2  Programs  as  axioms  and  axioms  as  programs 

In the discussion of the implementation for the Stack data type, we described
STAK(arr,t) as a pair whose first component is an Array and second component is an
Integer, and viewed equations such as

    TOP(STAK(arr,t)) = ACCESS(arr,t)

as definitions of programs for operating on the STAK pairs. Suppose, however, that we
now view STAK as an operation, whose syntactic specification is
STAK(Array[Integer,elementtype],Integer) -> Stack[elementtype]. Then the above equation
for TOP and the other program equations can be viewed as axioms which comprise a
semantic specification for STAK. As an axiom, we would read the TOP equation as "if stk
is the result of applying STAK to an Array arr and an Integer t, the value returned by
TOP(stk) is ACCESS(arr,t)."

As an axiomatic specification of the Stack data type, the implementation of Figure
2.3 is inferior to the specification of Figure 2.1, in that it is not self-contained (it requires
knowledge  of  properties  of  Arrays  and  Integers).  We  have  called  attention  to  the  view  of 
programs  as  axioms  mainly  because  it  suggests  a duality  between  programs  and  axioms 
whose  other  half  - axioms  as  programs  - can  be  very  fruitfully  exploited. 

We  can,  in  fact,  view  the  axioms  of  Figure  2.1  as  programs,  simply  by  regarding 
NEWSTACK  and  PUSH(stk,elm)  as  trees  rather  than  operations.  All  structures  built  with 
NEWSTACK  and  PUSH  can  be  pictured  as  trees.  For  example, 

PUSH(PUSH(NEWSTACK,3),7) 

can be diagrammed as

                PUSH
               /    \
            PUSH     7
           /    \
    NEWSTACK     3


The Stack axioms can be viewed as defining operations which produce and access such
tree structures:

    NEWSTACK = [NEWSTACK]

    PUSH(stk,elm) =   [PUSH]
                      /    \
                    stk    elm

    POP([NEWSTACK]) = [NEWSTACK]

    POP(  [PUSH]  ) = stk
          /    \
        stk    elm

    etc.

The  two  equations  for  POP  together  define  POP  as  an  operation  which  first  checks  which 
kind  of  node  it  is  given  and  then  proceeds  accordingly.  This  is  an  example  of  a direct 
implementation. 
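A direct implementation of this kind can be sketched in a few lines of Python. The sketch is illustrative only: the class names NewStack and Push are our own rendering of the tree nodes, None stands in for UNDEFINED, and each function body is a transcription of the corresponding pair of axioms from Figure 2.1.

```python
from dataclasses import dataclass
from typing import Any, Union

UNDEFINED = None  # assumption: None plays the role of UNDEFINED


@dataclass(frozen=True)
class NewStack:
    """The NEWSTACK leaf node."""


@dataclass(frozen=True)
class Push:
    """A PUSH node with subtrees stk and elm."""
    stk: "Stack"
    elm: Any


Stack = Union[NewStack, Push]


def pop(s: Stack) -> Stack:
    # POP checks which kind of node it is given and proceeds accordingly.
    if isinstance(s, NewStack):
        return s          # POP(NEWSTACK) = NEWSTACK
    return s.stk          # POP(PUSH(stk,elm)) = stk


def top(s: Stack) -> Any:
    if isinstance(s, NewStack):
        return UNDEFINED  # TOP(NEWSTACK) = UNDEFINED
    return s.elm          # TOP(PUSH(stk,elm)) = elm


def isnew(s: Stack) -> bool:
    return isinstance(s, NewStack)


def replace(s: Stack, elm: Any) -> Stack:
    return Push(pop(s), elm)  # REPLACE(stk,elm) = PUSH(POP(stk),elm)
```

With these definitions, Push(Push(NewStack(), 3), 7) is the tree diagrammed earlier, and pop applied to it yields Push(NewStack(), 3).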

Direct  implementations  are  useful  from  a number  of  standpoints.  In  the  first  place, 
the  concept  of  a direct  implementation  can  serve  as  an  aid  to  constructing  specifications; 
i.e.,  one  can  try  to  write  the  semantic  axioms  so  that  they  can  serve  as  programs 
operating  on  tree  structures.  If  this  can  be  done,  and  one  has  a compiler  for  actually 
producing  running  implementations  of  such  programs,  then  one  can  experiment  with  the 
operations,  testing  to  a limited  extent  whether  they  have  the  properties  intended.  More 
importantly,  one  can  also  test  high-level  algorithms  which  are  programmed  in  terms  of  the 
data  type,  before  fixing  upon  a particular  implementation  of  the  data  type.  Thus,  a true 
top-down  design  methodology  can  be  achieved. 

Many  examples  of  algebraic  axiomatizations  of  data  types  with  explanations  of  their 
direct  implementations  appear  in  [Guttag76b].  [Guttag76c]  contains  a more  detailed 
discussion  of  direct  implementations  and  the  related  idea  of  "reduction  systems." 


3.  Symbol  table  example 

The  Stack  data  type  is  too  simple,  in  a number  of  respects,  to  properly  illustrate  the 
properties  and  uses  of  algebraic  axiom  specifications.  The  equations  have  too  simple  a 
form  and  the  usable  implementations  are  too  straightforward  to  have  "interesting"  proofs 
of  correctness.  A much  richer  example  is  provided  by  the  symbol  table  data  type,  which 
was  first  specified  algebraically  in  [Guttag75],  [Guttag76a].  In  this  example  we  deal  with  a 
common  but  non-trivial  data  structuring  problem,  the  design  of  a symbol  table  for  a 
compiler  for  a block-structured  language.  We  wish  to  specify  and  implement  a set  of 
operations  for  maintaining  the  symbol  table  during  compilation  of  a program.  An  informal 
specification  of  the  operations  might  be: 


INIT:        allocate and initialize the symbol table.

ENTERBLOCK:  prepare a new local naming scope.

ADDID:       add an identifier and its attributes to the symbol table.

LEAVEBLOCK:  discard entries from the most current scope and re-establish the
             next outer scope.

ISINBLOCK:   has a specified identifier already been declared in this scope
             (used to check for duplicate declaration)?

RETRIEVE:    return the attributes associated with the most local definition of a
             specified identifier.

The  formal  specification  which  we  shall  adopt  is  given  in  Figure  3.1: 


Figure 3.1  The Symboltable Data Type

type Symboltable
syntax
    INIT -> Symboltable,
    ENTERBLOCK(Symboltable) -> Symboltable,
    ADDID(Symboltable,Identifier,Attributelist) -> Symboltable,
    LEAVEBLOCK(Symboltable) -> Symboltable,
    ISINBLOCK(Symboltable,Identifier) -> Boolean,
    RETRIEVE(Symboltable,Identifier) -> Attributelist ∪ {UNDEFINED}.
semantics
    declare symtab: Symboltable, id,id1: Identifier, attrlist: Attributelist;
    1) LEAVEBLOCK(INIT) = INIT,
    2) LEAVEBLOCK(ENTERBLOCK(symtab)) = symtab,
    3) LEAVEBLOCK(ADDID(symtab,id,attrlist)) = LEAVEBLOCK(symtab),
    4) ISINBLOCK(INIT,id) = FALSE,
    5) ISINBLOCK(ENTERBLOCK(symtab),id) = FALSE,
    6) ISINBLOCK(ADDID(symtab,id,attrlist),id1) =
           IF id = id1
           THEN TRUE
           ELSE ISINBLOCK(symtab,id1),
    7) RETRIEVE(INIT,id) = UNDEFINED,
    8) RETRIEVE(ENTERBLOCK(symtab),id) = RETRIEVE(symtab,id),
    9) RETRIEVE(ADDID(symtab,id,attrlist),id1) =
           IF id = id1
           THEN attrlist
           ELSE RETRIEVE(symtab,id1).

As  an  aid  to  understanding  these  axioms,  it  is  useful  to  consider  a direct 
implementation.  We  let  the  representation  be  trees  of  INIT,  ENTERBLOCK,  and  ADDID  nodes 
and  use  the  full  set  of  semantic  axioms  as  programs.  Then,  for  example,  if  this  direct 
implementation is used by a compiler in processing the following program segment,

    begin
        real X,Y;
        begin
            integer Y;
        end
    end


the symbol table

    SYM = ADDID(ENTERBLOCK(ADDID(ADDID(INIT,X,Real),Y,Real)),Y,Integer)

will be created within the innermost block. Diagrammed as a tree structure, this is:

                    ADDID
                  /   |   \
        ENTERBLOCK    Y    Integer
            |
          ADDID
        /   |   \
    ADDID   Y    Real
   /  |  \
INIT  X   Real


Suppose now that we apply RETRIEVE to SYM and X. Simulating the RETRIEVE operation
using the direct implementation, we have

    RETRIEVE(SYM,X)
    --[by axiom 9]--> RETRIEVE(ENTERBLOCK(ADDID(ADDID(INIT,X,Real),Y,Real)),X)
    --[by axiom 8]--> RETRIEVE(ADDID(ADDID(INIT,X,Real),Y,Real),X)
    --[by axiom 9]--> RETRIEVE(ADDID(INIT,X,Real),X)
    --[by axiom 9]--> Real

If tree structure operations are implemented with reasonable efficiency, then this
direct  implementation  could  actually  be  used  in  a compiler  to  test  the  Symboltable 
specification  more  extensively,  and,  simultaneously,  to  test  other  components  of  the 
compiler.  However,  because  this  implementation  requires  potentially  long  searches  for  the 
RETRIEVE,  ISINBLOCK,  and  LEAVEBLOCK  operations,  it  is  not  very  efficient. 
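
The direct implementation of the Symboltable type can be sketched in Python as follows. This is an illustration, not code from the paper: the node classes Init, EnterBlock and AddId are our own names for the three kinds of tree node, identifiers and attribute lists are represented by plain strings, None stands in for UNDEFINED, and each branch is annotated with the axiom of Figure 3.1 it transcribes.

```python
from dataclasses import dataclass
from typing import Any

UNDEFINED = None  # assumption: None plays the role of UNDEFINED


@dataclass(frozen=True)
class Init:
    """The INIT leaf node."""


@dataclass(frozen=True)
class EnterBlock:
    symtab: Any


@dataclass(frozen=True)
class AddId:
    symtab: Any
    id: str
    attrlist: Any


def leaveblock(s):
    if isinstance(s, Init):
        return s                      # axiom 1
    if isinstance(s, EnterBlock):
        return s.symtab               # axiom 2
    return leaveblock(s.symtab)       # axiom 3


def isinblock(s, id1):
    if isinstance(s, (Init, EnterBlock)):
        return False                  # axioms 4 and 5
    return True if s.id == id1 else isinblock(s.symtab, id1)       # axiom 6


def retrieve(s, id1):
    if isinstance(s, Init):
        return UNDEFINED              # axiom 7
    if isinstance(s, EnterBlock):
        return retrieve(s.symtab, id1)                              # axiom 8
    return s.attrlist if s.id == id1 else retrieve(s.symtab, id1)  # axiom 9


# The symbol table SYM built for the program segment above.
SYM = AddId(
    EnterBlock(AddId(AddId(Init(), "X", "Real"), "Y", "Real")), "Y", "Integer"
)
```

Here retrieve(SYM, "X") walks the same chain of nodes as the simulation above, which makes the cost of the long searches visible: every call traverses the tree from the root.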

The  design  of  a hierarchically  structured  implementation  of  the  Symboltable  data 
type,  using  algebraic  specifications  of  the  data  types  employed  at  each  level  of  the 
implementation, is carried out in detail in [Guttag76c]. Such a design was first presented
in [Guttag75], where the lowest level of the implementation was expressed as a set of
PL/I-like programs. The presentation in [Guttag76c] differs in that it uses the restricted
set  of  language  features  described  in  Section  2.  For  brevity  we  present  here  only  the 
top  level  of  the  implementation  development. 

If  we  ignored  the  complication  introduced  by  block  structure,  a symbol  table  could 
be  viewed  abstractly  as  providing  a mapping  from  identifiers  to  attribute  lists.  One  way 
to  handle  block  structure,  especially  suitable  in  a one  pass  compiler,  is  to  have  a stack  of 
mappings,  each  mapping  being  from  identifiers  to  attribute  lists,  with  the  top  mapping  on 
the  stack  corresponding  to  the  current  innermost  block  being  processed.  This  is  the 
method  we  have  chosen  in  the  implementation  given  in  Figure  3.2. 

Figure 3.2  An Implementation of
the Symboltable Data Type with a Stack of Mappings

representation
    SYMT(Stack[Mapping[Identifier,Attributelist]]) -> Symboltable.
programs
    declare stk: Stack, id: Identifier, attrlist: Attributelist;
    INIT = SYMT(PUSH(NEWSTACK,NEWMAP)),
    ENTERBLOCK(SYMT(stk)) = SYMT(PUSH(stk,NEWMAP)),
    ADDID(SYMT(stk),id,attrlist) =
        SYMT(REPLACE(stk,DEFMAP(TOP(stk),id,attrlist))),
    LEAVEBLOCK(SYMT(stk)) =
        IF ISNEW(POP(stk))
        THEN SYMT(REPLACE(stk,NEWMAP))
        ELSE SYMT(POP(stk)),
    ISINBLOCK(SYMT(stk),id) = ISDEFINED(TOP(stk),id),
    RETRIEVE(SYMT(stk),id) =
        IF ISNEW(stk)
        THEN UNDEFINED
        ELSE IF ISDEFINED(TOP(stk),id)
             THEN EVMAP(TOP(stk),id)
             ELSE RETRIEVE(SYMT(POP(stk)),id).


This implementation uses the operations of the Stack data type (Figure 2.1)
and the Mapping data type schema of Figure 3.3. Note that we have bound the type
parameters of the Stack and Mapping types. The Mapping data type is the same as the
Array data type of Figure 2.2, except for the addition of an ISDEFINED operation.


Figure 3.3  Mapping Data Type

type Mapping[domaintype: Type, rangetype: Type]
syntax
    NEWMAP -> Mapping,
    DEFMAP(Mapping,domaintype,rangetype) -> Mapping,
    EVMAP(Mapping,domaintype) -> rangetype,
    ISDEFINED(Mapping,domaintype) -> Boolean.
semantics
    declare map: Mapping, dval,dval1: domaintype, rval: rangetype;
    EVMAP(NEWMAP,dval) = UNDEFINED,
    EVMAP(DEFMAP(map,dval,rval),dval1) =
        IF dval = dval1 THEN rval ELSE EVMAP(map,dval1),
    ISDEFINED(NEWMAP,dval1) = FALSE,
    ISDEFINED(DEFMAP(map,dval,rval),dval1) =
        IF dval = dval1 THEN TRUE ELSE ISDEFINED(map,dval1).
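
The stack-of-mappings implementation of Figure 3.2, together with the Stack and Mapping operations it relies on, can be sketched in Python. The representation choices here are assumptions made for the illustration and are not the lower-level implementations of [Guttag76c]: a Stack is an immutable tuple, a Mapping is a tuple of (domain, range) pairs with the newest pair first, SYMT(stk) is represented by stk itself, and None stands in for UNDEFINED.

```python
UNDEFINED = None  # assumption: None plays the role of UNDEFINED

# Stack operations (Figure 2.1), on immutable tuples.
NEWSTACK = ()
def push(stk, elm): return stk + (elm,)
def pop(stk): return stk[:-1]                    # POP(NEWSTACK) = NEWSTACK
def top(stk): return stk[-1] if stk else UNDEFINED
def isnew(stk): return stk == ()
def replace(stk, elm): return push(pop(stk), elm)

# Mapping operations (Figure 3.3), on tuples of (dval, rval) pairs.
NEWMAP = ()
def defmap(m, dval, rval): return ((dval, rval),) + m
def evmap(m, dval): return next((v for k, v in m if k == dval), UNDEFINED)
def isdefined(m, dval): return any(k == dval for k, _ in m)

# Symboltable programs (Figure 3.2); SYMT(stk) is represented by stk.
INIT = push(NEWSTACK, NEWMAP)

def enterblock(stk):
    return push(stk, NEWMAP)

def addid(stk, id, attrlist):
    return replace(stk, defmap(top(stk), id, attrlist))

def leaveblock(stk):
    if isnew(pop(stk)):
        return replace(stk, NEWMAP)
    return pop(stk)

def isinblock(stk, id):
    return isdefined(top(stk), id)

def retrieve(stk, id):
    if isnew(stk):
        return UNDEFINED
    if isdefined(top(stk), id):
        return evmap(top(stk), id)
    return retrieve(pop(stk), id)

# The symbol table built for the program segment of Section 3.
SYM = addid(enterblock(addid(addid(INIT, "X", "Real"), "Y", "Real")), "Y", "Integer")
```

For SYM, retrieve finds Y in the top mapping and X only after popping to the outer block's mapping, so a lookup costs at most one mapping search per enclosing block.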

The  next  level  of  the  implementation  discussed  in  [Guttag76c]  contains  the 
implementation  of  the  Stack  data  type  in  terms  of  (Array, Integer)  pairs,  as  discussed  in  the 
previous  section,  and  an  implementation  of  the  Mapping  data  type  in  terms  of  a particular 
kind  of  hash  table  (an  array  containing  lists  of  domain/range  pairs).  The  discussion  of 
verification  methods  in  the  next  section  does  not  rely  on  this  material,  and  we  omit  it  here. 


4.  Proving  correctness 

In  this  section  we  turn  to  the  problem  of  proving  correctness  of  implementations  of 
data  types.  We  shall  continue  to  center  the  discussion  around  the  example  of  the 
Symboltable data type. We shall show parts of the proof of correctness of the example
implementation  given  in  Section  3. 

One  of  the  most  important  aspects  of  the  proof  techniques  which  will  be  used  to 
prove  correctness  of  algebraically  specified  data  types  is  that  the  proofs  can  be  factored 
into  levels  corresponding  to  the  levels  of  the  implementation.  To  prove  the  correctness  of 
a data  type  which  is  implemented  in  terms  of  other  data  types,  we  need  to  rely  only  on 
the  axiomatic  specifications  of  the  other  data  types,  not  on  their  implementations.  For 
example,  as  we  will  show  in  Section  4.2,  to  verify  the  top  level  of  the  implementation  of 
the  Symboltable  data  type,  we  use  the  semantic  axioms  for  the  Stack  and  Mapping  data 
types,  and  ignore  their  implementations. 

Another  highly  significant  aspect  of  the  use  of  axioms  for  other  data  types  in  the 
proof  of  an  implementation  is  the  computational  nature  of  the  proof  steps:  the  axioms  are 
used as rewrite rules and proofs proceed via a series of reductions. This aspect is also
illustrated in Section 4.2. The main implication of this is that to a large extent the proof
process can be easily automated.

Two extremely important considerations in verifications of data type
implementations are data type invariants and equality interpretations. Again, the
algebraic axiom approach seems particularly well suited to dealing with these
considerations. Section 4.3 discusses data type invariants, and we refer the reader to
[Guttag76c]  for  a discussion  of  equality  interpretations.  Section  4.4  contains  a brief 
discussion  of  an  interactive  system  for  verifying  implementations  of  data  types  which  is  in 
use  at  the  USC  Information  Sciences  Institute. 


4.1  Formal  deduction  using  the  Boolean  data  type 

Although  we  will  not  be  completely  formal  in  the  example  proofs  in  the  following 
sections,  it  is  useful  to  discuss  at  this  point  the  basis  we  have  chosen  for  automating  such 
proofs,  since  it  fits  very  well  with  the  overall  approach  of  using  algebraically  specified 
data  types.  In  fact,  many  of  the  formal  deductions  will  be  based  on  the  following  algebraic 
specification  of  the  Boolean  data  type: 


Figure 4.1  Boolean Data Type

type Boolean
syntax
    TRUE -> Boolean,
    FALSE -> Boolean,
    (IF Boolean THEN Boolean ELSE Boolean) -> Boolean,
    (Boolean ∧ Boolean) -> Boolean,
    (Boolean ∨ Boolean) -> Boolean,
    ¬Boolean -> Boolean,
    (Boolean ⊃ Boolean) -> Boolean,
    (Boolean ≡ Boolean) -> Boolean.
semantics
    declare p,q,r: Boolean;
    (IF TRUE THEN q ELSE r) = q,
    (IF FALSE THEN q ELSE r) = r,
    (p ∧ q) = IF p THEN q ELSE FALSE,
    (p ∨ q) = IF p THEN TRUE ELSE q,
    ¬p = IF p THEN FALSE ELSE TRUE,
    (p ⊃ q) = IF p THEN q ELSE TRUE,
    (p ≡ q) = IF p THEN q ELSE ¬q.


In this specification each of the usual logical operators is related by an axiom to the
IF-THEN-ELSE operator, which is axiomatized by the first two axioms relating it to TRUE
and FALSE. The use of ∧, ∨, etc. in infix rather than prefix form is purely for
convenience. Although the above specification defines IF-THEN-ELSE only for Boolean
operands, we assume that every data type T can use IF-THEN-ELSE with the syntactic
specification

    (IF Boolean THEN T ELSE T) -> T

and  the  same  axioms  as  the  first  two  in  the  above  specification. 

We  shall  make  frequent  use  of  the  following  rewrite  rules  for  IF-THEN-ELSE,  which 
are  theorems  provable  from  the  Boolean  axioms: 


1.  [Repeated result rule]  (IF p THEN q ELSE q) = q

2.  [Redundant IF rule]  (IF p THEN TRUE ELSE FALSE) = p

3.  [IF-distribution rule]  (IF (IF p THEN q ELSE r) THEN a ELSE b) =
        IF p THEN (IF q THEN a ELSE b) ELSE (IF r THEN a ELSE b)

4.  [Logical substitution rules]
        (IF p THEN q[p] ELSE r) = (IF p THEN q[TRUE for p] ELSE r)
        (IF p THEN q ELSE r[p]) = (IF p THEN q ELSE r[FALSE for p]).


In the left hand side of a rule, "q[p]" means q must be an expression in which p occurs as
a subexpression (possibly p = q). In the right hand side, "q[TRUE for p]" is the result of
substituting  TRUE  for  all  occurrences  of  p in  q.  We  require  that  p occur  as  a 
subexpression  of  q to  limit  applicability  of  the  rule  to  those  places  where  it  will  effect  a 
change.  Rules  1-4  are  also  theorems  for  IF-THEN-ELSE  in  other  data  types. 

As  noted  in  [McCarthy63]  and  [Boyer75],  the  Boolean  axioms  and  rules  1-4  combine 
to yield a system for "simplifying" expressions of the propositional calculus, such as:


a)  A ∧ TRUE,

b)  (A ∨ B) ∨ A,

c)  ((A ∧ B) ∧ A) ⊃ (B ∧ C),

d)  ((A ∨ B) ⊃ C) ≡ ((A ⊃ C) ∧ (B ⊃ C)).

For example,

    ((A ∨ B) ∨ A)
    --[by ∨ axiom]-->
        (IF (IF A THEN TRUE ELSE B) THEN TRUE ELSE A)
    --[by IF-distribution rule]-->
        (IF A THEN (IF TRUE THEN TRUE ELSE A)
              ELSE (IF B THEN TRUE ELSE A))
    --[by IF axiom]-->
        (IF A THEN TRUE ELSE (IF B THEN TRUE ELSE A))
    --[by logical substitution rule]-->
        (IF A THEN TRUE ELSE (IF B THEN TRUE ELSE FALSE))
    --[by Redundant IF rule]-->
        (IF A THEN TRUE ELSE B)

The result is equivalent to (A ∨ B). Similarly, a) can be reduced to A, c) can be reduced to
a form equivalent to (A ∧ B) ⊃ C, and d) can be reduced to TRUE.

In  [Guttag76c]  we  in  fact  prove  that  the  Boolean  axioms  and  rules  1-4  are  complete 
with  respect  to  the  propositional  calculus,  in  that  any  valid  expression  in  propositional 
calculus  (i.e.,  provable  by  truth  table)  is  reducible  to  TRUE  by  use  of  these  rewrite  rules. 
These  rules  form  the  basis  for  an  automatic  simplifier  for  logical  expressions  described  in 
[Guttag76c]. 
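
The way the Boolean axioms and rules 1-4 cooperate can be illustrated with a small Python reducer. This is a sketch written for this exposition and is not the simplifier of [Guttag76c]; the encoding is an assumption: variables are strings, TRUE and FALSE are Python booleans, and operators are tuples tagged "if", "and", "or", "not", "imp" and "eqv". Logical substitution is realized by carrying an environment env that records the truth value assumed for each condition already tested.

```python
def expand(e):
    """Rewrite logical operators into IF-THEN-ELSE using the Figure 4.1 axioms."""
    if not isinstance(e, tuple):
        return e                      # a variable or TRUE/FALSE
    op, *args = e
    args = [expand(a) for a in args]
    if op == "if":
        return ("if", *args)
    if op == "and":
        return ("if", args[0], args[1], False)
    if op == "or":
        return ("if", args[0], True, args[1])
    if op == "not":
        return ("if", args[0], False, True)
    if op == "imp":
        return ("if", args[0], args[1], True)
    if op == "eqv":
        return ("if", args[0], args[1], ("if", args[1], False, True))
    raise ValueError(f"unknown operator {op!r}")


def reduce_if(e, env):
    """Reduce an IF-only expression; env maps tested variables to True/False."""
    if isinstance(e, bool):
        return e
    if isinstance(e, str):
        return env.get(e, e)          # logical substitution rules
    _, p, q, r = e
    if isinstance(p, tuple):          # IF-distribution rule
        _, p1, q1, r1 = p
        return reduce_if(("if", p1, ("if", q1, q, r), ("if", r1, q, r)), env)
    p = reduce_if(p, env)
    if p is True:                     # first IF axiom
        return reduce_if(q, env)
    if p is False:                    # second IF axiom
        return reduce_if(r, env)
    t = reduce_if(q, {**env, p: True})
    f = reduce_if(r, {**env, p: False})
    if t == f:                        # Repeated result rule
        return t
    if t is True and f is False:      # Redundant IF rule
        return p
    return ("if", p, t, f)


def simplify(e):
    return reduce_if(expand(e), {})
```

Applied to expression b), simplify(("or", ("or", "A", "B"), "A")) returns ("if", "A", True, "B"), the same form reached by the hand reduction above.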

It  is,  of  course,  necessary  to  go  beyond  propositional  calculus  and  include  deductive 
rules for equality and other operators. We will not go into the details of the formal rules
here,  except  to  mention  the  following  important  rule: 


5.  [Case analysis rule]  f(a1, ..., (IF p THEN x ELSE y), ..., an) =
        IF p THEN f(a1, ..., x, ..., an)
             ELSE f(a1, ..., y, ..., an)
    when f ≠ IF-THEN-ELSE.


This is a "second-order" rewrite rule and the rule applies to IF-THEN-ELSE expressions in
any operand position. The case in which f is IF-THEN-ELSE is already covered in part by
the IF-distribution rule. Note that if f were permitted to be IF-THEN-ELSE then the
expression IF a1 THEN (IF p THEN x ELSE y) ELSE a3 could be transformed to IF p THEN (IF
a1 THEN x ELSE a3) ELSE (IF a1 THEN y ELSE a3) and the rule would apply again, leading
to an infinite sequence of applications. An important application of this case analysis rule
occurs when f is the equality operator, e.g.

    ((IF p THEN x ELSE y) = z) --[by Case analysis rule]--> (IF p THEN (x = z) ELSE (y = z)).


4.2  Verification  of  one  of  the  Symboltable  axioms 

The  basic  proof  technique  for  verifying  an  implementation  of  a data  type  is  to  show 
that  each  of  the  axiomatic  specifications  for  the  data  type  is  satisfied,  when  the  programs 
for  the  operations  of  the  data  type  are  substituted  into  the  axioms.  As  our  first  example, 
consider  the  ninth  axiom  for  the  Symboltable  data  type: 

    RETRIEVE(ADDID(symtab,id,attrlist),id1) =
        IF id = id1 THEN attrlist
                    ELSE RETRIEVE(symtab,id1).              (4.2-1)

To get started, we assume there exists a stack stk such that symtab = SYMT(stk). (We will
show in Section 4.3 how to verify this assumption.) Substituting, we get the verification
condition

    RETRIEVE(ADDID(SYMT(stk),id,attrlist),id1) =
        IF id = id1 THEN attrlist
                    ELSE RETRIEVE(SYMT(stk),id1).           (4.2-2)


The goal now is to show that this equation is true using the programs for RETRIEVE and
ADDID (Figure 3.2) and the axioms for the Stack and Mapping data types (Figures 2.1 and
3.3) as rewrite rules. We must also use the axioms and some theorems for the Boolean
data type, as discussed in Section 4.1.

Working  first  on  the  left  hand  side  of  (4.2-2),  we  make  the  following  reductions: 

LHS --[by ADDID program]-->
    RETRIEVE(SYMT(stk1),id1)  where stk1 = REPLACE(stk,DEFMAP(TOP(stk),id,attrlist))

--[by RETRIEVE program]-->
    IF ISNEW(stk1)
    THEN UNDEFINED
    ELSE IF ISDEFINED(TOP(stk1),id1)
         THEN EVMAP(TOP(stk1),id1)
         ELSE RETRIEVE(SYMT(POP(stk1)),id1)

--[by REPLACE, ISNEW, POP and TOP axioms]-->
    IF ISDEFINED(DEFMAP(TOP(stk),id,attrlist),id1)
    THEN EVMAP(DEFMAP(TOP(stk),id,attrlist),id1)
    ELSE RETRIEVE(SYMT(POP(stk)),id1)

--[by ISDEFINED axiom]-->
    IF (IF id = id1 THEN TRUE ELSE ISDEFINED(TOP(stk),id1))
    THEN EVMAP(DEFMAP(TOP(stk),id,attrlist),id1)
    ELSE RETRIEVE(SYMT(POP(stk)),id1)

--[by IF-distribution rule]-->
    IF id = id1
    THEN IF TRUE
         THEN EVMAP(DEFMAP(TOP(stk),id,attrlist),id1)
         ELSE RETRIEVE(SYMT(POP(stk)),id1)
    ELSE IF ISDEFINED(TOP(stk),id1)
         THEN EVMAP(DEFMAP(TOP(stk),id,attrlist),id1)
         ELSE RETRIEVE(SYMT(POP(stk)),id1)

--[by IF axiom]-->
    IF id = id1
    THEN EVMAP(DEFMAP(TOP(stk),id,attrlist),id1)
    ELSE IF ISDEFINED(TOP(stk),id1)
         THEN EVMAP(DEFMAP(TOP(stk),id,attrlist),id1)
         ELSE RETRIEVE(SYMT(POP(stk)),id1)

--[by EVMAP axiom]-->
    IF id = id1
    THEN (IF id = id1 THEN attrlist ELSE EVMAP(TOP(stk),id1))
    ELSE IF ISDEFINED(TOP(stk),id1)
         THEN (IF id = id1 THEN attrlist ELSE EVMAP(TOP(stk),id1))
         ELSE RETRIEVE(SYMT(POP(stk)),id1)

--[by Logical substitution rule]-->
    IF id = id1
    THEN (IF TRUE THEN attrlist ELSE EVMAP(TOP(stk),id1))
    ELSE IF ISDEFINED(TOP(stk),id1)
         THEN (IF FALSE THEN attrlist ELSE EVMAP(TOP(stk),id1))
         ELSE RETRIEVE(SYMT(POP(stk)),id1)

--[by IF axioms]-->
    IF id = id1
    THEN attrlist
    ELSE IF ISDEFINED(TOP(stk),id1)
         THEN EVMAP(TOP(stk),id1)
         ELSE RETRIEVE(SYMT(POP(stk)),id1)

Having  already  substituted  once  for  RETRIEVE,  we  do  not  substitute  again  for  the  recursive 
call.  The  right  hand  side  of  (4.2-2)  reduces  to  the  same  expression  upon  use  of  the 
program  for  RETRIEVE.  Thus  (4.2-2)  has  been  shown  to  be  true,  and  axiom  (4.2-1)  has 
been  verified  for  the  implementation. 
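
As a complement to the proof, axiom (4.2-1) can also be checked by brute force on small cases. The Python sketch below is a test, not a verification, and rests on illustrative assumptions: stacks are tuples, mappings are tuples of pairs, SYMT(stk) is represented by stk, None stands in for UNDEFINED, and the identifier and attribute sets IDS and ATTRS and the helper symtabs are hypothetical names introduced here.

```python
from itertools import product

UNDEFINED = None  # assumption: None plays the role of UNDEFINED

# Stack and Mapping operations on tuples (illustrative representations).
def push(stk, elm): return stk + (elm,)
def pop(stk): return stk[:-1]
def top(stk): return stk[-1] if stk else UNDEFINED
def isnew(stk): return stk == ()
def replace(stk, elm): return push(pop(stk), elm)
def defmap(m, dval, rval): return ((dval, rval),) + m
def evmap(m, dval): return next((v for k, v in m if k == dval), UNDEFINED)
def isdefined(m, dval): return any(k == dval for k, _ in m)

# Symboltable programs of Figure 3.2, with SYMT(stk) represented by stk.
INIT = push((), ())
def enterblock(stk): return push(stk, ())
def addid(stk, id, attrlist): return replace(stk, defmap(top(stk), id, attrlist))
def leaveblock(stk): return replace(stk, ()) if isnew(pop(stk)) else pop(stk)
def retrieve(stk, id):
    if isnew(stk):
        return UNDEFINED
    if isdefined(top(stk), id):
        return evmap(top(stk), id)
    return retrieve(pop(stk), id)

IDS = ("X", "Y")               # hypothetical identifiers
ATTRS = ("Real", "Integer")    # hypothetical attribute lists


def symtabs(depth):
    """Symbol tables obtained from INIT by at most `depth` operations."""
    level, seen = [INIT], [INIT]
    for _ in range(depth):
        nxt = []
        for s in level:
            nxt.append(enterblock(s))
            nxt.append(leaveblock(s))
            nxt.extend(addid(s, i, a) for i, a in product(IDS, ATTRS))
        level = nxt
        seen.extend(nxt)
    return seen


def axiom9_holds(symtab, id, attrlist, id1):
    lhs = retrieve(addid(symtab, id, attrlist), id1)
    rhs = attrlist if id == id1 else retrieve(symtab, id1)
    return lhs == rhs


ALL_HOLD = all(
    axiom9_holds(s, i, a, i1)
    for s in symtabs(3)
    for i, a, i1 in product(IDS, ATTRS, IDS)
)
```

Such a check covers only the finitely many tables it enumerates; the reduction proof above covers all of them.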

The  reason  for  showing  the  proof  of  the  axiom  in  as  much  detail  as  we  have  is  that 
we  wish  to  make  clear  that  each  step  of  the  proof  is  a straightforward  application  of  a 
reduction  rule.  The  only  choice  to  be  made  in  terms  of  "proof  strategy"  is  which  of 
several  possible  reductions  to  apply  at  the  next  step.  It  should  be  noted,  however,  that 
the  use  of  the  axioms  as  left-to-right  rewrite  rules  can  be  combined  with  restrictions  on 
the form of the axioms (discussed in [Guttag76c]) to preclude "infinite loops" in the proof
process.  Thus,  assuming  that  the  axioms  are  satisfiable,  all  possible  reduction  sequences 
will  terminate  with  the  same  result  [Rosen73].  The  reduction  process  is,  therefore,  readily 
automated.  The  above  proof  was  carried  out  fully  automatically  by  the  prototype  data 
type  verification  system  at  USC  Information  Sciences  Institute. 

Verification  of  the  axioms  in  the  manner  of  the  above  proof  actually  only  establishes 
partial  correctness  of  the  implementation,  i.e.,  that  if  the  programs  terminate  they  give 
results  which  satisfy  the  axioms.  Proof  of  termination  must  be  done  separately. 
However, implementations of data types are often simple enough that termination is
obvious,  and  we  shall  not  deal  with  the  issue  of  formal  proofs  of  termination  in  this  paper. 


4.3  Use  and  verification  of  data  type  invariants 

Although  the  example  axiom  verification  of  the  previous  section  was  long  and 
required  the  use  of  many  different  rewrite  rules,  it  did  not  illustrate  the  use  of  data  type 
invariants.  Consider,  for  example,  the  second  symbol  table  axiom: 

    LEAVEBLOCK(ENTERBLOCK(symtab)) = symtab.

Substituting

    symtab = SYMT(stk)                                              (4.3-1)

and applying all possible reductions (including the case analysis rule with f being =), we
obtain

    IF ISNEW(stk) THEN (SYMT(PUSH(NEWSTACK,NEWMAP)) = SYMT(stk))
                  ELSE (SYMT(stk) = SYMT(stk)).                     (4.3-2)

We cannot reduce this equation to TRUE unless we know that

    ISNEW(stk) = FALSE.                                             (4.3-3)

To  prove  (4.3-3),  we  recall  that  stk  is  not  just  an  arbitrarily  chosen  stack,  but  one  which 
was  assumed  to  be  generated  as  a representation  of  a symbol  table.  If  we  examine  the 
syntactic  specification  of  the  Symboltable  data  type,  we  see  that  the  only  operations 
which produce symbol tables as their output are INIT, ENTERBLOCK, ADDID, and
LEAVEBLOCK. Examining the program for each of these operations, we see that INIT
generates an initial stack, stk, for which (4.3-3) is true, and that if (4.3-3) is true of the
stack  representing  the  symbol  table  argument  of  any  of  the  other  operations,  then  it  is 
true  of  the  stack  produced  in  the  result.  Therefore,  (4.3-3)  must  be  true  of  all  stacks 
produced  as  representations  of  symbol  tables  by  operations  of  the  Symboltable  data  type. 
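
The role of the invariant can be seen concretely in a small Python experiment. As before this is only a sketch under stated assumptions (stacks as tuples, mappings as tuples of pairs, SYMT(stk) represented by stk; the names IDS, ATTRS and reachable are hypothetical): it enumerates the stacks that the Symboltable operations can generate from INIT, confirms that none of them is a new stack, and shows that the second axiom fails for NEWSTACK, a stack that no Symboltable operation produces.

```python
from itertools import product

# Stack operations on tuples; NEWMAP and DEFMAP on tuples of pairs.
NEWSTACK = ()
NEWMAP = ()
def push(stk, elm): return stk + (elm,)
def pop(stk): return stk[:-1]
def top(stk): return stk[-1] if stk else None
def isnew(stk): return stk == ()
def replace(stk, elm): return push(pop(stk), elm)
def defmap(m, dval, rval): return ((dval, rval),) + m

# Symboltable programs of Figure 3.2, with SYMT(stk) represented by stk.
INIT = push(NEWSTACK, NEWMAP)
def enterblock(stk): return push(stk, NEWMAP)
def addid(stk, id, attrlist): return replace(stk, defmap(top(stk), id, attrlist))
def leaveblock(stk): return replace(stk, NEWMAP) if isnew(pop(stk)) else pop(stk)

IDS = ("X", "Y")               # hypothetical identifiers
ATTRS = ("Real", "Integer")    # hypothetical attribute lists


def reachable(depth):
    """Stacks generated from INIT by at most `depth` Symboltable operations."""
    level, seen = [INIT], [INIT]
    for _ in range(depth):
        nxt = []
        for s in level:
            nxt.append(enterblock(s))
            nxt.append(leaveblock(s))
            nxt.extend(addid(s, i, a) for i, a in product(IDS, ATTRS))
        level = nxt
        seen.extend(nxt)
    return seen


def invariant(stk):
    """(4.3-3): ISNEW(stk) = FALSE."""
    return not isnew(stk)


def axiom2_holds(stk):
    """LEAVEBLOCK(ENTERBLOCK(symtab)) = symtab, with symtab = SYMT(stk)."""
    return leaveblock(enterblock(stk)) == stk


INVARIANT_OK = all(invariant(s) for s in reachable(3))
AXIOM2_OK = all(axiom2_holds(s) for s in reachable(3))
AXIOM2_ON_NEWSTACK = axiom2_holds(NEWSTACK)   # violates the invariant
```

For NEWSTACK the left side of the axiom evaluates to PUSH(NEWSTACK,NEWMAP), which is the THEN branch of (4.3-2), so the equation is false there; the invariant is exactly what excludes that case.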

The  general  principle  being  used  in  the  above  proof  is  that  of  data  type  induction 
(called  "generator  induction"  in  [Spitzen75]  and  [Wegbreit76]).  Paraphrasing  the 
discussion  in  [Spitzen75],  p.  141,  we  suppose  that  a data  type  T has,  according  to  its 
syntactic specification, exactly the operations F1, ..., Fn whose range is the set of
values of T. Let P(x) be a property of values of type T. Then if the truth of P for
arguments of type T of each Fj implies the truth of P for the results of calls of Fj allowed
by the syntactic specification of T, then it follows that P is true of all values of the data
type. Assuming strong type-checking, the validity of this rule follows by induction on the
number of computation steps involving values of type T. As Spitzen and Wegbreit point
out, the data type induction principle "is analogous to the principle of complete induction over
the integers. As with complete induction, one of the results which must be established is
the base step, that P is true of the results of those primitives F with no arguments of type
T." In the case of symbol tables, INIT is the only such primitive.
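
The induction scheme can be illustrated with a small executable model. The sketch below is not the paper's implementation: it assumes a stack modelled as a Python tuple of dictionaries, takes INIT and LEAVEBLOCK from the programs quoted in this section, and guesses plausible bodies for ENTERBLOCK and ADDID (both are assumptions). It checks the base step on INIT and then checks that each generator preserves the property ISNEW(stk) = FALSE, up to a bounded depth. This is a bounded test of the induction hypotheses, not a proof.

```python
# Hypothetical model: a stack is a tuple of dicts, last element is the top.
NEWSTACK = ()

def is_new(stk):
    return len(stk) == 0

def push(stk, m):
    return stk + (m,)

def pop(stk):
    return stk[:-1]

def replace(stk, m):
    # REPLACE(stk, m): overwrite the top element of the stack.
    return stk[:-1] + (m,)

# Generators of the Symboltable type, expressed on the representation.
def init():
    return push(NEWSTACK, {})          # as in (4.3-2)

def enterblock(stk):
    return push(stk, {})               # assumed body

def addid(stk, ident, attrs):
    top = dict(stk[-1])                # assumed body
    top[ident] = attrs
    return replace(stk, top)

def leaveblock(stk):
    # As in the LEAVEBLOCK program used in the derivation for (4.3-4).
    return replace(stk, {}) if is_new(pop(stk)) else pop(stk)

def invariant(stk):
    return not is_new(stk)             # ISNEW(stk) = FALSE

def check_by_generator_induction(depth):
    """Check the base step, then the inductive step for every generator,
    on all representations reachable within `depth` generator calls."""
    frontier = [init()]
    assert all(invariant(s) for s in frontier)      # base step: INIT
    checked = 1
    for _ in range(depth):
        successors = []
        for s in frontier:
            for t in (enterblock(s), addid(s, "x", 1), leaveblock(s)):
                assert invariant(t)                 # inductive step
                successors.append(t)
        checked += len(successors)
        frontier = successors
    return checked

print(check_by_generator_induction(4))
```

Because only the four generators can produce a representation, checking them is all the induction principle asks for; no other operation needs to be examined.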

Let  us  examine  the  proof  of  (4.3-3)  by  data  type  induction  a bit  more  carefully.  We 
can  regard  a property  Pj  that  we  wish  to  prove  about  symbol  tables  by  data  type 
induction  as  an  operation  on  symbol  tables  with  the  syntactic  specification 

    Pj(Symboltable) -> Boolean,

and the semantic specifications

    Pj(INIT)
    Pj(symtab) ⊃ Pj(ENTERBLOCK(symtab))
    Pj(symtab) ⊃ Pj(ADDID(symtab,id,attrlist))
    Pj(symtab) ⊃ Pj(LEAVEBLOCK(symtab))                       (4.3-4)


which  can  be  generated  automatically  from  the  syntactic  specification  of  the  Symboltable 
data  type.  To  prove  (4.3-3),  we  let  the  interpretation  of  Pj  in  terms  of  the 
implementation  values  be 
    Pj(SYMT(stk)) = (ISNEW(stk) = FALSE).                     (4.3-5)


We  then  prove  each  of  the  conditions  in  (4.3-4)  under  this  interpretation  of  Pj  along  with 
the  implementation  programs  of  the  Symboltable  operations.  For  example,  using  (4.3-1), 
the  fourth  condition  becomes: 

    Pj(SYMT(stk)) ⊃ Pj(LEAVEBLOCK(SYMT(stk)))

  --[by (4.3-5) and LEAVEBLOCK program]-->

    (ISNEW(stk) = FALSE)
      ⊃ Pj(IF ISNEW(POP(stk)) THEN SYMT(REPLACE(stk,NEWMAP))
                              ELSE SYMT(POP(stk)))

  --[by case analysis rule]-->

    (ISNEW(stk) = FALSE)
      ⊃ (IF ISNEW(POP(stk)) THEN Pj(SYMT(REPLACE(stk,NEWMAP)))
                            ELSE Pj(SYMT(POP(stk))))

  --[by (4.3-5)]-->

    (ISNEW(stk) = FALSE)
      ⊃ (IF ISNEW(POP(stk)) THEN ISNEW(REPLACE(stk,NEWMAP)) = FALSE
                            ELSE ISNEW(POP(stk)) = FALSE)

  --[by ISNEW axiom and logical substitution rule]-->

    (ISNEW(stk) = FALSE)
      ⊃ (IF ISNEW(POP(stk)) THEN FALSE = FALSE
                            ELSE FALSE = FALSE)

which reduces to TRUE with the application of the reflexive property of Boolean equality,
the repeated result rule, and the ⊃ axiom.

A property  P which  is  true  of  all  values  of  a data  type  is  called  an  invariant  of  the 
data  type.  If  the  proof  requires  interpretation  of  P in  terms  of  the  implementation,  then  it 
is  called  an  implementation  invariant  (of  the  data  type).  Thus,  (with  the  establishment  of 
the  other  three  conditions  in  (4.3-4)),  we  have  almost  completed  showing  that  (4.3-3)  is  an 
implementation  invariant  of  the  Symboltable  data  type.  What  remains  to  be  shown  is  that 

    P(symtab) = (∃ stk ∈ Stack, symtab = SYMT(stk))           (4.3-6)

is  an  implementation  invariant.  Again,  this  can  be  easily  verified  using  data  type  induction. 
The  invariant  (4.3-6)  is  an  example  of  a representation  invariant,  which  we  define  to  be 
that  implementation  invariant  which  describes  how  the  abstract  values  are  represented. 
The  representation  invariant  can  be  constructed  automatically  from  the  representation  part 
of  the  implementation. 

As  an  example  of  an  invariant  of  a data  type  other  than  an  implementation  invariant, 
consider 
    P(symtab) = (symtab = INIT)
                ∨ (∃ symtab1, symtab = ENTERBLOCK(symtab1))
                ∨ (∃ symtab1, id, attrlist,
                     symtab = ADDID(symtab1, id, attrlist))   (4.3-7)

which  can  be  shown  to  satisfy  each  of  the  conditions  in  (4.3-4),  using  the  syntactic 
specification  and  semantic  axioms  for  the  Symboltable  data  type.  Invariants  such  as 
(4.3-7)  are  useful  in  proofs  which  make  use  of  the  data  type,  in  that  they  can  be  used  to 
reduce  to  a minimum  the  number  of  cases  which  must  be  considered  in  a proof  by  case 
analysis.  Note  that  (4.3-7)  can  also  be  regarded  as  the  representation  invariant  for  the 
direct  implementation  of  the  Symboltable  data  type,  as  discussed  in  Section  2.3. 

4.4  An  interactive  data  type  verification  system 

The  programs  which  we  have  developed  for  use  in  verifying  data  type 
implementations  are  currently  being  run  as  part  of  an  interactive  verification  system 
[Good75].  The  complete  implementation  of  the  symbol  table  operations  by  a stack  of  hash 
tables  has  been  verified  using  these  programs.  As  the  user-interface  for  the  proof 
process  is  still  under  development,  we  will  not  give  a detailed  description  of  the  commands 
of  the  current  facilities  (called  the  Data  Type  Verification  System),  but  merely  indicate  the 
nature  of  the  commands  and  overall  proof  process  with  an  example,  the  verification  of  the 
top  level  implementation  by  a Stack  of  Mappings.  The  first  step  is  to  direct  the  system  to 
adopt  the  programs  of  data  type  Symboltable,  and  the  axioms  of  data  types  Stack  and 
Mapping.  These  programs  and  axioms  would  all  be  in  the  form  of  rewrite  rules  which  the 
user  had  just  entered  or  had  read  in  from  files.  The  command  for  "adopting"  a set  of 
rules  is  separated  from  the  act  of  reading  them  in  so  that  several  sets  of  rules  for  an 
operator  can  coexist  within  the  system.  Assuming  the  Symboltable  axioms  have  also  been 
input  to  the  system,  the  user  then  directs  the  system  to  generate  the  verification 
conditions  for  the  data  type.  These  would  consist  of  the  Symboltable  axioms  and  the 
equality  axioms  for  the  Symboltable  equality  operator  (see  [Guttag76c]),  all  interpreted  in 
terms  of  the  representation. 

The  user  can  then  attempt  to  prove  each  of  the  verification  conditions  using  CEVAL, 
which  is  an  evaluator  implementing  the  logical  deduction  rules  discussed  in  Section  4.1.  In 
these  proofs  the  rewrite  rules  from  the  Symboltable  programs  and  Stack  and  Mapping 
axioms are used automatically, without further direction from the user. In some cases, as
noted  in  Section  4.3,  completion  of  a proof  will  require  one  or  more  assumptions  to  be 
made  about  the  representation  or  the  Stack  or  Mapping  data  types.  If  this  is  the  case  the 
system  will  stop  with  a reduced  form  of  the  original  verification  condition.  Examination  of 
this  output  will  often  lead  the  user  to  the  necessary  assumptions.  Initially  these 
assumptions  are  input  by  the  user  and  used  as  needed  without  justification.  To  complete 
the  verification  of  the  implementation,  it  is  necessary  to  prove  these  assumptions,  or  a 
stronger  set  of  assumptions,  as  invariants  (of  the  Symboltable  data  type  implementation  or 
of  the  Stack  or  Mapping  data  types).  The  verification  conditions  sufficient  to  establish 
these  invariants  are  constructed  using  the  syntactic  specifications  of  the  data  types,  in 
accordance  with  the  principle  of  data  type  induction,  as  described  in  Section  4.3. 
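
The flavor of proof by reduction can be conveyed with a toy rewriting evaluator. The sketch below is not CEVAL and does not reproduce the Data Type Verification System's commands; it is a minimal illustration under stated assumptions. Terms are nested Python tuples, and the rule set (the usual Stack axioms for ISNEW, POP and TOP over PUSH, the IF simplifications including the repeated result rule, and reflexivity of equality) is an assumed subset chosen for the example.

```python
# Terms: atoms are strings; compound terms are tuples (operator, arg1, ...).

def step(t):
    """Rewrite the arguments one step, then try each rule at the root."""
    if not isinstance(t, tuple):
        return t
    t = (t[0],) + tuple(step(a) for a in t[1:])
    op, args = t[0], t[1:]
    if op == "ISNEW":
        if args[0] == "NEWSTACK":
            return "TRUE"                       # ISNEW(NEWSTACK) = TRUE
        if isinstance(args[0], tuple) and args[0][0] == "PUSH":
            return "FALSE"                      # ISNEW(PUSH(s,x)) = FALSE
    if op in ("POP", "TOP"):
        if isinstance(args[0], tuple) and args[0][0] == "PUSH":
            # POP(PUSH(s,x)) = s ; TOP(PUSH(s,x)) = x
            return args[0][1] if op == "POP" else args[0][2]
    if op == "IF":
        cond, then_part, else_part = args
        if cond == "TRUE":
            return then_part
        if cond == "FALSE":
            return else_part
        if then_part == else_part:
            return then_part                    # repeated result rule
    if op == "EQ" and args[0] == args[1]:
        return "TRUE"                           # reflexivity of equality
    return t

def normalize(t):
    """Apply rewrite steps until no rule changes the term."""
    while True:
        nxt = step(t)
        if nxt == t:
            return t
        t = nxt

# The consequent of the last line of the derivation in Section 4.3:
goal = ("IF", ("ISNEW", ("POP", "stk")),
        ("EQ", "FALSE", "FALSE"),
        ("EQ", "FALSE", "FALSE"))
print(normalize(goal))
```

When a term stops short of TRUE, the residue (for example an unreduced ISNEW(stk)) plays the role of the "reduced form" from which the user reads off the assumption that is still needed.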
5.  Conclusions


Elsewhere  it  has  been  argued  that  abstract  data  types  can  be  effectively  employed 
as a "thought tool" in the structured development of programs ([Guttag76b], [Standish73],
[Wulf76], [Liskov75]). In this paper we have attempted to show that the use of
algebraic  axioms  as  a means  for  describing  data  abstractions  is  also  valuable,  when 
properly  used,  for  both  formal  and  informal  program  validation. 

In  sections  2 and  3 we  discussed  the  axiom  language  and  some  techniques  for 
constructing  algebraic  axiomatizations,  based  on  an  important  duality  between  programs 
and  axioms.  Parts  of  an  implementation  of  a moderately  complex  symbol  table  were 
exhibited  to  show  how  these  techniques  might  be  used  in  practice.  In  Section  4 we 
demonstrated  how  properly  written  axioms  can  be  used  in  formal  program  verification. 
Abstract  data  types  provide  a mechanism  for  factoring  proofs  into  manageable  sections. 
At  one  level  of  abstraction  the  axiomatic  specification  provides  us  with  theorems  that  may 
be  applied  in  the  verification  of  programs  that  use  abstract  data  types.  We  have 
illustrated  how  such  axioms  may  be  used  to  verify  the  correctness  of  implementations  of 
higher-level  abstract  data  types  in  terms  of  lower-level  ones.  Writing  the  axioms  in  a 
certain  style  allows  them  to  be  used  as  rewrite  rules  so  that  the  proofs  can  in  large  part 
be  carried  out  as  a sequence  of  reductions.  The  feasibility  of  automating  such  proofs  has 
been  demonstrated  with  a prototype  interactive  verification  system  for  data  types. 

It  is  important  to  notice  that  the  techniques  developed  in  this  paper  are  essentially 
programming language independent. While the availability of languages with the compile-time
type facilities of SIMULA 67 [Dahl68], CLU [Liskov74], or Alphard [Wulf76] will make
these  techniques  easier  to  apply,  they  are  by  no  means  essential.  If  one  exercises  enough 
self  discipline  to  ensure  the  validity  of  data  type  induction,  the  techniques  described 
should  prove  useful  in  the  development  of  programs  in  any  language.  The  ultimate  test  of 
any  research  designed  to  enhance  the  process  of  software  development  must  lie  in  the 
application  of  the  results  of  that  research  to  a large  software  project.  We  feel  that  the 
work  described  in  this  paper  lays  the  groundwork  necessary  for  such  a test. 


Acknowledgements 

We would like to thank Dick Jenks, Ralph London, Nancy Lynch, Mark Moriconi, Jernej
Polajnar,  Dave  Wile,  and  Marty  Yonke  for  their  careful  reading  of  earlier  drafts  of  this 
paper,  and  Betty  Randall  for  her  help  in  typing  the  manuscript. 


References 


[Boyer75] R. S. Boyer and J S. Moore, "Proving theorems about LISP functions,"
J. ACM, 22, 1, January 1975, 129-144.

[Dahl68] O.-J. Dahl, The SIMULA 67 common base language, Norwegian Computing
Center, Oslo, 1968.

[Goguen75] J. A. Goguen, J. W. Thatcher, E. G. Wagner, and J. B. Wright, "Abstract
data-types as initial algebras and correctness of data representations," Proceedings,
Conference on Computer Graphics, Pattern Recognition and Data Structure,
May 1975.

[Good75] D. I. Good, R. L. London, and W. W. Bledsoe, "An interactive program
verification system," IEEE Transactions on Software Engineering, SE-1, 1,
March 1975, 56-67.

[Guttag75] J. V. Guttag, "The specification and application to programming of abstract
data types," Ph. D. Thesis, University of Toronto, Department of Computer Science,
1975, available as Computer System Research Report CSRG-59.

[Guttag76a] J. V. Guttag, "Abstract data types and the development of data
structures," Supplement to the Proceedings of the SIGPLAN/SIGMOD Conference on
Data: Abstraction, Definition, and Structure, March 1976, pp. 37-46.

[Guttag76b] J. V. Guttag, E. Horowitz, and D. Musser, "The design of data type
specifications," USC Information Sciences Institute Research Report, Marina del Rey,
Ca., 1976 (to appear in Proc. Second International Conference on Software
Engineering, San Francisco, October 1976).

[Guttag76c] J. V. Guttag, E. Horowitz, and D. R. Musser, "Abstract data types and
software validation," USC Information Sciences Institute Research Report, Marina
del Rey, Ca., 1976.

[Hoare72] C. A. R. Hoare, "Proof of correctness of data representations," Acta
Informatica, 4, 1972, pp. 271-281.

[Horowitz76] E. Horowitz and S. Sahni, Fundamentals of Data Structures, Computer
Science Press, June 1976.

[Liskov74] B. H. Liskov and S. N. Zilles, "Programming with abstract data types,"
Proceedings of ACM SIGPLAN Symposium on Very High Level Languages,
SIGPLAN Notices, 9, 4, April 1974, 50-59.

[Liskov75] B. H. Liskov and S. N. Zilles, "Specification techniques for data
abstractions," IEEE Transactions on Software Engineering, Vol. SE-1, No. 1,
March 1975, pp. 7-18.

[Manna74] Z. Manna, Mathematical Theory of Computation, McGraw-Hill, 1974.

[McCarthy63] J. McCarthy, "Basis for a mathematical theory of computation," in
Computer Programming and Formal Systems, P. Braffort and
D. Hirschberg (eds.), North-Holland Publishing Company, 1963, pp. 33-70.

[Parnas72] D. L. Parnas, "Information distribution aspects of design methodology,"
Proc. IFIP Congress 71, Vol. 1 (1972), pp. 339-344.

[Rosen73] B. Rosen, "Tree manipulating systems and Church-Rosser theorems," J.
ACM, 20, 1, January 1973, 160-187.

[Spitzen75] J. Spitzen and B. Wegbreit, "The verification and synthesis of data
structures," Acta Informatica 4, 127-144 (1975).

[Standish73] T. A. Standish, "Data structures: an axiomatic approach," BBN Report
No. 2639, Bolt Beranek and Newman, Cambridge, Mass., 1973.

[Wegbreit76] B. Wegbreit and J. Spitzen, "Proving properties of complex data
structures," J. ACM, 23, 2, April 1976, 389-396.

[Wulf76] W. A. Wulf, R. L. London, and M. Shaw, "Abstraction and verification in
Alphard: introduction to language and methodology," Carnegie-Mellon University
and USC Information Sciences Institute Technical Reports, 1976.

[Zilles75] S. N. Zilles, "Abstract specifications for data types," IBM Research
Laboratory, San Jose, California, 1975.


STANFORD  VERIFIER 


AN  ON-LINE  SESSION  WITH  THE  STANFORD  VERIFIER 

by 

David  Luckham 


The  following  is  a transcript  with  comments  of  an  on-line  session  with  the  Stanford 
verifier.  A program  was  suggested  at  the  beginning  of  the  session  by  a visitor  to  the 
project.  His  program  was  simpler  than  most  examples  currently  being  experimented 
with  using  this  verifier,  but  being  a surprise,  it  enabled  him  to  see  some  of  what  goes 
on  in  the  process  of  verifying  a new  program.  The  session  includes  examples  of  both 
verification  and  debugging  using  this  kind  of  system. 

Comments  are  in  lower  case,  commands  typed  by  the  user  are  prefixed  by  'V. 
Responses  from  the  verifier  are  prefixed  or  headed  by  *’s. 

We started with the program below with ENTRY/EXIT specifications stating that its
output array is a permutation of its input; an INVARIANT assertion was included for each
of the WHILE loops. PERMUTATION(A,B,L,R) means that array A is a permutation of B
within the range of indices L to R.


PASCAL

TYPE NARRAY=ARRAY[1..N] OF INTEGER;

PROCEDURE SORT (VAR A:NARRAY);
INITIAL A=A0;
EXIT PERMUTATION(A,A0,1,N);
VAR TEMP:INTEGER;
VAR I,J:INTEGER;
BEGIN
  I:=1; J:=1;
  INVARIANT PERMUTATION(A,A0,1,N)
  WHILE (I<N) DO
  BEGIN
    INVARIANT PERMUTATION(A,A0,1,N)
    WHILE (J<N-I) DO
    BEGIN
      IF A[J]>A[J+1] THEN BEGIN TEMP:=A[J]; A[J]:=A[J+1]; A[J+1]:=TEMP END;
      J:=J+1
    END;
    I:=I+1;
  END
The program was typed into a disk file called EX.

»R  VERIFY  120  call  the  verifier  with  60k  core. 

«(PARSE  EX)  verify  the  program  in  file  EX. 

The first two results were syntax error messages due to [1..N] instead of [1:N] (not
standard Pascal), and a missing END. We omit these messages.

The  corrected  program  was: 

PASCAL

TYPE NARRAY=ARRAY[1:N] OF INTEGER;

PROCEDURE SORT (VAR A:NARRAY);
INITIAL A=A0;
EXIT PERMUTATION(A,A0,1,N);
VAR TEMP:INTEGER;
VAR I,J:INTEGER;
BEGIN
  I:=1; J:=1;
  INVARIANT PERMUTATION(A,A0,1,N)
  WHILE (I<N) DO
  BEGIN
    INVARIANT PERMUTATION(A,A0,1,N)
    WHILE (J<N-I) DO
    BEGIN
      IF A[J]>A[J+1] THEN BEGIN TEMP:=A[J]; A[J]:=A[J+1]; A[J+1]:=TEMP END;
      J:=J+1
    END;
    I:=I+1;
    J:=1
  END
END;.;
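
For readers who want to experiment, the annotated program can be mimicked with run-time assertions. The following Python sketch is only an analogue, not the verifier's semantics: the function names are hypothetical, and PERMUTATION(A,B,L,R) is read as "equal multisets on the index range L..R and identical elements outside it", which is an assumption. As in the session, only the permutation property is checked; nothing is claimed about ordering.

```python
from collections import Counter

def is_permutation(a, b, lo, hi):
    """Assumed reading of PERMUTATION(A,B,L,R) with 1-based, inclusive bounds."""
    inside = Counter(a[lo - 1:hi]) == Counter(b[lo - 1:hi])
    outside = a[:lo - 1] == b[:lo - 1] and a[hi:] == b[hi:]
    return inside and outside

def sort_checked(a):
    """Transliteration of the corrected program; INVARIANT and EXIT
    assertions become run-time checks."""
    n = len(a)
    a0 = list(a)                              # INITIAL A=A0
    i, j = 1, 1
    assert is_permutation(a, a0, 1, n)        # invariant holds on entry
    while i < n:
        while j < n - i:
            if a[j - 1] > a[j]:               # A[J]>A[J+1], 1-based
                a[j - 1], a[j] = a[j], a[j - 1]
            assert is_permutation(a, a0, 1, n)    # inner INVARIANT
            j += 1
        i += 1
        j = 1
        assert is_permutation(a, a0, 1, n)    # outer INVARIANT
    assert is_permutation(a, a0, 1, n)        # EXIT
    return a

result = sort_checked([3, 1, 2, 5, 4])
print(result)
```

A run-time check of this kind exercises the assertions only on the inputs supplied, whereas the verification conditions below cover every path through the program for all inputs.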


comment: The following notation is used to express the result of operations on arrays:
<A,[I],E> represents the array obtained from A by placing E in the Ith position.

«(PARSE EX)

The resulting four verification conditions were:

*******************************************
VERIFICATION CONDITION: SORT 1
*******************************************

(TRUE                                   corresponds to the path
 ->                                     from ENTRY to outer WHILE
 PERMUTATION(A,A,1,N) &                 to EXIT.
 (¬I00<N &
  PERMUTATION(A00,A,1,N)
  ->
  PERMUTATION(A00,A,1,N)))

*******************************************
VERIFICATION CONDITION: SORT 2
*******************************************

(I<N &
 PERMUTATION(A,A0,1,N)                  corresponds to the path
 ->                                     round the outer loop to
 PERMUTATION(A,A0,1,N) &                inner WHILE to EXIT.
 (¬J02<N-I &
  PERMUTATION(A02,A0,1,N)
  ->
  PERMUTATION(A02,A0,1,N)))

*******************************************
VERIFICATION CONDITION: SORT 3
*******************************************

(¬A[J+1]<A[J] &                         corresponds to the path
 J<N-I &                                round the inner loop
 PERMUTATION(A,A0,1,N)                  when the IF test is false.
 ->
 PERMUTATION(A,A0,1,N))

*******************************************
VERIFICATION CONDITION: SORT 4
*******************************************

(A[J+1]<A[J] &                          corresponds to path
 J<N-I &                                round inner loop when
 PERMUTATION(A,A0,1,N)                  the IF test is true.
 ->
 PERMUTATION(<<A,[J],A[J+1]>,[(J+1)],A[J]>,A0,1,N))

*******************************************
SIMPLIFIED VERIFICATION CONDITION: SORT 1
*******************************************

(TRUE
 ->
 PERMUTATION(A,A,1,N))

*******************************************
SIMPLIFIED VERIFICATION CONDITION: SORT 2
*******************************************

TRUE

*******************************************
SIMPLIFIED VERIFICATION CONDITION: SORT 3
*******************************************

TRUE

*******************************************
SIMPLIFIED VERIFICATION CONDITION: SORT 4
*******************************************

(A[J+1]<A[J] &
 PERMUTATION(A,A0,1,N) &
 J+I<N
 ->
 PERMUTATION(<<A,[J],A[J+1]>,[(J+1)],A[J]>,A0,1,N))
*****

TIME: 0 CPU SECS, 1 REAL SECS


comment: The simplification has been made without any non-logical knowledge (e.g. the
verifier has not been given any facts about the concept PERMUTATION(A,B,1,N)). The
simplified VC SORT 1 shows that it doesn't even know PERMUTATION(A,A,1,N).

Notice  the  notation  for  the  array  manipulation  in  the  conclusion  of  VC  SORT  4.  In  more 
complicated  cases  it  can  get  pretty  unreadable.  Is  a more  readable  notation  possible  (or 
needed)? 

Our  visitor  now  asked  if  we  had  a file  of  lemmas  describing  standard  concepts  used  in 
verifying  sorting  programs.  So  we  now  read  in  a file  of  standard  lemmas  used  for 
versions  of  QUICKSORT.  The  file  contained  lemmas  about  many  concepts  other  than 
PERMUTATION. 

#(PARSE  (QUICK.GO))  read  in  the  lemmas. 

****** 

GOALFILE  ADDED  2 CPU  SECONDS 

The  set  of  lemmas  added  is  below  (the  relevant  parts  are  those  dealing  with 
PERMUTATION  and  EXCHANGE).  EXCHANGE  defines  a shorthand  for  the  kind  of  array 
transformation  expression  that  appeared  above  in  VC  SORT  4. 
GOALFILE % Goals and axioms to verify quicksort. Also needs ARITH.GO %

% ISSORTEDARRAY(A,B,L,R) asserts that array A is a sorted form of array B in the
  interval [L,R] %

AXIOM ISSORTEDARRAYOF(@A,@B,@L,@R) ↔ PERMUTATION(A,B,L,R) ∧
      ORDERED(A,L,R);

% A is B sorted on [L,R] ↔ it is a permutation of B and is ordered %

% PERMUTATION(A,B,L,R) asserts that array A is a permutation of array B in the
  interval [L,R] %

AXIOM PERMUTATION(@A,@A,@L,@R) ↔ TRUE;

% An array is a permutation of itself %

AXIOM IF (L≤P1) ∧ (P1≤R) ∧ (L≤P2) ∧ (P2≤R) THEN
      PERMUTATION(EXCHANGE(@A,@P1,@P2),@A,@L,@R) ↔ TRUE;

% Exchanging two elements is a valid permutation %

GOAL PERMUTATION(@A,@B,@L,@R) SUB PERMUTATION(A,@C,L,R) ∧
     PERMUTATION(@C,B,L,R);

% Transitivity of the permutation relation %

GOAL PERMUTATION(@A,@B,@L,@R) SUB PERMUTATION(A,B,@L1,@R1) ∧
     (L<@L1) ∧ (@R1<R) ∧ PRESERVEEXCEPT(A,B,@L1,@R1);

% Permuting a subarray of A between L1 and R1 and leaving the rest
  unchanged is a valid permutation %

% EXCHANGE(A,P1,P2) is array A with the elements in positions P1 and P2 exchanged %

AXIOM IF Y=B[P2] THEN
      <<@B,[@P1],@Y>,[@P2],@X> = EXCHANGE(<B,[P1],X>,P1,P2);

% ORDERED(A,L,R) asserts that array A is in ascending order on the interval [L,R] %

AXIOM IF R<L THEN ORDERED(@A,@L,@R) ↔ TRUE;

% An empty interval is ordered %

GOAL ORDERED(@A,@L,@R) SUB ORDERED(A,L,@P-1) ∧ ORDERED(A,@P+1,R) ∧
     BIGGEST(A[@P],A,L,@P-1) ∧ SMALLEST(A[@P],A,@P+1,R) ∧ (L<@P) ∧
     (@P<R);

% Array A is ordered on [L,R] if A[P] partitions the elements of A
  with those bigger in the interval [P+1,R] and those smaller in the
  interval [L,P-1], and each subarray is ordered %

GOAL ORDERED(@A,@L,@R) SUB ORDERED(@B,L,R) ∧ PRESERVEEXCEPT(A,@B,@L1,@R1)
     ∧ ((@R1<L) ∨ (R<@L1));

% Array A remains ordered on [L,R] if it was ordered and nothing in
  the interval [L,R] has been changed %

% BIGGEST(X,A,L,R) asserts that X is > anything in array A in the interval [L,R] %

AXIOM IF L>R THEN BIGGEST(@X,@A,@L,@R) ↔ TRUE;

% X is bigger than anything in an empty interval %

GOAL BIGGEST(@X,@A,@L,@R) SUB BIGGEST(X,A,L,R-1) ∧ (A[R]<X);

% X is BIGGEST on [L,R] if it is BIGGEST on [L,R-1] and > A[R] %

GOAL BIGGEST(@A[@M1],@A,@L,@R) SUB BIGGEST(@B[M1],@B,L,R) ∧
     ((@R1<L) ∨ (R<@L1)) ∧ ((@R1<M1) ∨ (M1<@L1)) ∧
     PRESERVEEXCEPT(A,@B,@L1,@R1);

% A[M1] remains BIGGEST on [L,R] if no changes have been made to
  either A[M1] or any A[I] in the interval [L,R] %

GOAL BIGGEST(@X,<@A,[@R1],@Y>,@L,@R) SUB BIGGEST(X,A,L,R) ∧ (R1>R);

% X remains BIGGEST on [L,R] if the change to A occurs to the right
  of the interval %

GOAL BIGGEST(@A01[@X],@A01,@L,@R)
     SUB PRESERVEEXCEPT(A01,@A,L,R) ∧
         BIGGEST(A[X],A,L,R) ∧
         PERMUTATION(A01,A,L,R) ∧
         (X>R);

% If A[X] is the biggest elt. of A in the range [L,R], and X is outside
  the range, and A01 results by permuting this range, then A01[X] will
  remain the biggest elt of A01 in the range [L,R] %

% SMALLEST(X,A,L,R) asserts that X is < anything in array A in the interval [L,R] %

AXIOM IF L>R THEN SMALLEST(@X,@A,@L,@R) ↔ TRUE;

% X is smaller than anything in an empty interval %

GOAL SMALLEST(@X,@A,@L,@R) SUB SMALLEST(X,A,L+1,R) ∧ (X<A[L]);

% X is SMALLEST on [L,R] if it is SMALLEST on [L+1,R] and X<A[L] %

GOAL SMALLEST(@A[@M1],@A,@L,@R) SUB SMALLEST(@B[M1],@B,L,R) ∧
     ((@R1<L) ∨ (R<@L1)) ∧ ((@R1<M1) ∨ (M1<@L1)) ∧
     PRESERVEEXCEPT(A,@B,@L1,@R1);

% A[M1] is SMALLEST on [L,R] if it was before and neither A[M1] nor
  anything in [L,R] has been changed %

GOAL SMALLEST(@X,@A,@L,@R) SUB SMALLEST(X,A,@L1,R) ∧ (@L1<L);

% SMALLEST of a subfile is true %

GOAL SMALLEST(@X,<@A,[@L1],@Y>,@L,@R) SUB SMALLEST(X,A,@L2,R) ∧ (L1<L) ∧ (@L2<L);

% X remains smallest on [L,R] if it was on a larger interval and the
  change to A occurred to the left of L %

GOAL SMALLEST(@A00[@X],@A00,@L,@R)
     SUB SMALLEST(@A[X],@A,L,R) ∧
         PERMUTATION(A00,A,L,R) ∧
         PRESERVEEXCEPT(A00,A,L,R) ∧
         (X<L);

% A00[X] remains the smallest elt. in the range [L,R] when
  A00 is obtained from A by permuting [L,R] and X is outside
  this range. %


% PRESERVEEXCEPT(A,B,L,R) asserts that any differences between arrays A and B occur
  IN the interval [L,R] %

AXIOM PRESERVEEXCEPT(@A,@A,@L,@R) ↔ TRUE;

% If nothing changes, everything is preserved %

AXIOM IF (L<P) ∧ (P<R) THEN PRESERVEEXCEPT(<@A,[@P],@X>,@A,@L,@R) ↔ TRUE;

% Changes within [L,R] preserve things outside of [L,R] %

GOAL PRESERVEEXCEPT(EXCHANGE(@A,@R1,@L1),@B,@L,@R) SUB PRESERVEEXCEPT(A,B,L,R)
     ∧ (L<L1) ∧ (L<R1) ∧ (L1<R) ∧ (R1<R);

% Exchanges within [L,R] preserve things outside of [L,R] %

GOAL PRESERVEEXCEPT(@A,@B,@L,@R) SUB PRESERVEEXCEPT(A,B,@L1,@R1) ∧
     (L<@L1) ∧ (@R1<R);

% Nothing changes outside of [L,R] if nothing changes outside of a
  subinterval of [L,R] %

GOAL PRESERVEEXCEPT(@A,@B,@L,@R) SUB PRESERVEEXCEPT(A,@C,L,R) ∧
     PRESERVEEXCEPT(@C,B,L,R);

% Transitivity of PRESERVEEXCEPT %

% These are some inequality rules that the verifier could not deduce from ARITH.GO %

GOAL @X≤@X SUB TRUE;

GOAL @X≥@X SUB TRUE;

GOAL @X>@X-1 SUB TRUE;

GOAL @X<@X+1 SUB TRUE;

GOAL @X-1<@Z SUB (@X<@Y) ∧ (@Y<@Z);

GOAL @X<@Y+1 SUB X<Y;

GOAL @X≤@Y SUB X<Y;

% Tell it what an ∨ is %

GOAL @A ∨ @B SUB A,B;

.;




u(  SIMPLIFY  SORT  1) 


Now  try  to  prove  VC  SORT  1. 


******************************************* 
SIMPLIFIED  VERIFICATION  CONDITION:  SORT  1 
******************************************* 
TRUE 

*(  SIMPLIFY  SORT  4) 

*******************************************
SIMPLIFIED VERIFICATION CONDITION: SORT 4
*******************************************

(A[J+1]<A[J] &
 J+I<N
 ->
 PERMUTATION(EXCHANGE(A,J,J+1),A,1,N))

***** 

TIME:  0 CPU  SECS,  0 REAL  SECS 

comment:  Notice  that  the  array  manipulation  expression  has  been  replaced  by  the 
convenient  EXCHANGE  notation.  The  replacement  rules  permit  the  user  to  define  his 
own  notation. 
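
To see why the nested update expression in VC SORT 4 and the EXCHANGE form denote the same array, it may help to evaluate both on a concrete value. The sketch below uses two hypothetical helper functions, `update` for the `<A,[I],E>` notation and `exchange` for EXCHANGE(A,P1,P2), both with 1-based positions; neither name comes from the verifier.

```python
def update(a, i, e):
    """<A,[I],E>: a copy of A with E placed in position I (1-based)."""
    b = list(a)
    b[i - 1] = e
    return b

def exchange(a, p1, p2):
    """EXCHANGE(A,P1,P2): a copy of A with positions P1 and P2 swapped."""
    b = list(a)
    b[p1 - 1], b[p2 - 1] = a[p2 - 1], a[p1 - 1]
    return b

def at(a, i):
    """A[I] with a 1-based index."""
    return a[i - 1]

a = [10, 20, 30, 40]
j = 2

# <<A,[J],A[J+1]>,[(J+1)],A[J]> : both element values are read from the
# original array A, exactly as in the conclusion of VC SORT 4.
nested = update(update(a, j, at(a, j + 1)), j + 1, at(a, j))

print(nested)
print(exchange(a, j, j + 1))
```

Both expressions yield the array with positions J and J+1 swapped, which is the identity the EXCHANGE axiom in the goalfile lets the simplifier apply symbolically.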

VC  SORT  4 was  not  proved  because  the  relevant  lemma  requires  facts  about  the  range 
bounds  (L  and  R in  the  second  AXIOM  in  the  PERMUTATION  lemmas).  In  the  simple 
program here these bounds are fixed at 1 and N, so a special case of this lemma will
suffice. 
We now add a simple extra lemma on-line:

«(PARSE) 

oGOALFILE 

*R1: INFER PERMUTATION(EXCHANGE(@A,@J,@J+1),@A,1,N);.;
****** 

GOALFILE  ADDED  0 CPU  SECONDS 

The rule typed in on-line has been named R1.

^(SIMPLIFY  SORT  4) 

******************************************* 

SIMPLIFIED  VERIFICATION  CONDITION:  SORT  4 
******************************************* 

USING RULES R1,

TRUE 
Next we added a bug suggested by the visitor (the bug is in the line marked by #):

PASCAL

TYPE NARRAY=ARRAY[1:N] OF INTEGER;

PROCEDURE SORT (VAR A:NARRAY);
INITIAL A=A0;
EXIT PERMUTATION(A,A0,1,N);
VAR TEMP:INTEGER;
VAR I,J:INTEGER;
BEGIN
  I:=1; J:=1;
  INVARIANT PERMUTATION(A,A0,1,N)
  WHILE (I<N) DO
  BEGIN
    INVARIANT PERMUTATION(A,A0,1,N)
    WHILE (J<N-I) DO
    BEGIN
#     IF A[J]>A[J+1] THEN BEGIN TEMP:=A[J]; A[J]:=A[J+1]; A[J]:=TEMP END;
      J:=J+1
    END;
    I:=I+1;
  END
END;.;
«(PARSE EX)  where EX now contains the buggy program led to the result:
******************************************* 

SIMPLIFIED  VERIFICATION  CONDITION:  SORT  1 
******************************************* 

TRUE 

******************************************* 

SIMPLIFIED  VERIFICATION  CONDITION:  SORT  2 
******************************************* 

TRUE 

******************************************* 

SIMPLIFIED  VERIFICATION  CONDITION:  SORT  3 
******************************************* 

TRUE 


******************************************* 

SIMPLIFIED VERIFICATION CONDITION: SORT 4
******************************************* 

TRUE 

***** 

TIME:  1 CPU  SECS,  1 REAL  SECS 

comment:  very  interesting!  This  kind  of  thing  always  happens  in  demonstrations. 
However,  if  you  look  back  at  the  buggy  program  you  will  see  that  the  "bug"  does  not 
invalidate  the  PERMUTATION  property;  it  affects  the  ORDEREDness. 
So we tried another bug. The result was this:

*******************************************
SIMPLIFIED VERIFICATION CONDITION: SORT 1
*******************************************

TRUE

*******************************************
SIMPLIFIED VERIFICATION CONDITION: SORT 2
*******************************************

TRUE

*******************************************
SIMPLIFIED VERIFICATION CONDITION: SORT 3
*******************************************

TRUE

*******************************************
SIMPLIFIED VERIFICATION CONDITION: SORT 4
*******************************************

(A[J+1]<A[J] &
 J+I<N
 ->
 PERMUTATION(<A,[J],A[J+1]>,A,1,N))
*****

TIME: 1 CPU SECS, 2 REAL SECS


comment:  The  only  change  made  has  been  to  the  code.  The  first  three  VC’s  are  proved. 
By looking at VC SORT 4 can you tell what the bug is, and where it is? You can figure
out the error generated by the bug from the array expression <A,[J],A[J+1]>. The
boolean  tests  determining  the  path  on  which  the  bug  occurs  are  in  the  premisses  of  this 
VC. 




Here’s  the  second  buggy  program. 

PASCAL

TYPE NARRAY=ARRAY[1:N] OF INTEGER;

PROCEDURE SORT (VAR A:NARRAY);
INITIAL A=A0;
EXIT PERMUTATION(A,A0,1,N);
VAR TEMP:INTEGER;
VAR I,J:INTEGER;
BEGIN
  I:=1; J:=1;
  INVARIANT PERMUTATION(A,A0,1,N)
  WHILE (I<N) DO
  BEGIN
    INVARIANT PERMUTATION(A,A0,1,N)
    WHILE (J<N-I) DO
    BEGIN
      IF A[J]>A[J+1] THEN BEGIN TEMP:=A[J]; A[J]:=A[J+1] END;
      J:=J+1
    END;
    I:=I+1;
    J:=1
  END
END;.;
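
The difference between the two bugs can be reproduced on a two-element array. The Python sketch below is illustrative only; the three function names are hypothetical and each one transliterates the body of the IF statement from the correct program, the first buggy program, and the second buggy program respectively, using a 1-based position J.

```python
from collections import Counter

def swap_correct(a, j):
    b = list(a)
    temp = b[j - 1]          # TEMP:=A[J]
    b[j - 1] = b[j]          # A[J]:=A[J+1]
    b[j] = temp              # A[J+1]:=TEMP
    return b

def swap_bug1(a, j):
    b = list(a)
    temp = b[j - 1]          # TEMP:=A[J]
    b[j - 1] = b[j]          # A[J]:=A[J+1]
    b[j - 1] = temp          # A[J]:=TEMP   (first bug: wrong index)
    return b

def swap_bug2(a, j):
    b = list(a)
    temp = b[j - 1]          # TEMP:=A[J]
    b[j - 1] = b[j]          # A[J]:=A[J+1] (second bug: TEMP never stored)
    return b

def same_elements(a, b):
    return Counter(a) == Counter(b)

a = [2, 1]
for body in (swap_correct, swap_bug1, swap_bug2):
    out = body(a, 1)
    print(body.__name__, out, same_elements(out, a))
```

The first bug leaves the array unchanged, so the permutation property survives and only ordering is affected; the second bug duplicates A[J+1] and loses A[J], which is exactly what the unproved VC SORT 4 exposes through the expression <A,[J],A[J+1]>.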

The  whole  session  including  my  one  finger  typing  of  the  program  etc.  took  about 
twenty  minutes  of  console  time. 

We  are  pursuing  three  areas  of  experimental  research  with  this  verifier  at  present. 
(1)  The  verifier  is  currently  being  developed  to  exploit  its  use  as  a debugging  aid.  (2) 
We  are  experimenting  with  classes  of  more  complex  programs  (including  pointer 
manipulations)  and  with  methods  of  verifying  larger  programs  than  have  been  done  so 
far.  (3)  The  standardization  of  verifications  (i.e.  the  required  lemmas)  so  that  changes  in 
code  can  be  checked  quickly  using  a verification  of  a previous  version  of  the  program  is 
being  attempted  for  several  kinds  of  program.  This  should  lead  to  applications  to  the 
software maintenance problem. Further information on this project can be found in the
Stanford  Artificial  Intelligence  Laboratory  reports,  Stanford  University  (or  by  contacting 
the  author). 

ACKNOWLEDGEMENTS The goalfile is taken from Scott Drysdale's work on debugging
and verifying six versions of QUICKSORT. The visitor was Bill Carlson of ARPA.

Introduction


Gypsy  is  the  unifying  element  of  a 
complete  methodology  for  the  construction  of 
rigorously  verified  systems  programs.  This 
methodology  provides  an  integrated  way  of 
specifying,  designing,  and  implementing  programs 
and  of  verifying  that  they  always  execute  in 
conformity  with  their  stated  specifications, 
even  in  imperfect  execution  environments.  The 
Gypsy  language  provides  a precise  means  of 
expressing both a program and its specifications
from initial specification through program
design, implementation, verification, and
successive modification and evolution. This
integration of programming and specification
facilities into a common language is the most
significant single characteristic of Gypsy.
This requires the program designer and verifier
to comprehend only a single syntax and semantics
throughout program development. This also
allows program proofs to be constructed
rigorously throughout all stages of development,
thereby  bringing  maximal  benefit  to  the  total 
programming  process. 

The  incorporation  of  specifications  and 
programming  facilities  into  a single  language 
provides  three  complementary  approaches  to 
program  verification.  First,  formal  proofs  that 
the  program  meets  specifications  can  be 
constructed before any execution of the program
occurs. Second, specifications can be validated
by actual evaluation at run-time. Third, trace
facilities provide a convenient mechanism for
post-execution analysis if desired. This
establishes a very effective relationship
between program proof and run-time validation of
specifications. Those specifications that are
validated at run-time need not be proved, but
can be assumed in the proofs, thus reducing
significantly the size and complexity of the
formal  proofs.  This,  of  course,  increases 
program  execution  time,  but  provides  the  program 
designer  and  verifier  with  a choice  of 
implementation  and  verification  strategy. 

One  of  the  common,  and  often  unstated, 
assumptions  of  formal  program  proofs  is  that  of 
a perfect  execution  environment.  For  example, 
the  problems  of  arithmetic  on  a finite  set  of 
integers  often  are  ignored.  Also  it  invariably 
is assumed that if an assignment x:=a is
executed, then a successive reference to the
value of x will equal a. In reality, this
normally would imply that a computer memory
never will drop a bit in the word(s) where x is
stored. In Gypsy the conceptual span of both
specifications and program code has been
extended to include program execution in
imperfect environments. Specifications and
program code concerning data integrity, error
monitoring, and error isolation and recovery are
expressed in a unified form along with the
error-free environment statements.


In theory, methods of program proof can be
applied to programs of any size. In practice,
however, the provability of a program depends
directly on the degree to which a program can be
decomposed into small, independently provable
units. Gypsy supports this kind of
decomposition through facilities for both
operational and data abstraction. The data
abstraction is provided through a general
mechanism for access control. These abstraction
facilities can be applied uniformly either to
programs or to specifications, and provide an
effective  basis  for  incremental  design  and 
verification  of  a complete  program.  Gypsy 
further  supports  this  process  by  providing 
explicit  facilities  for  top-down  development. 
This  permits  higher-level  units  to  be  designed 
and  verified  even  though  some  of  the  lower  level 
details  of  their  implementations  may  still  be 
pending. 

The  original  target  of  Gypsy  was  the 
expression  of  verifiable  programs  for 
communications  processing  such  as  those  that 
might  be  found  at  the  node  of  a computer 
network.  This  led  to  the  incorporation  of 
verifiable  features  for  expressing  concurrency 
and  process  synchronization  and  for  expressing 
real-time  dependencies.  The  result  has  been  a 
high-level  language  for  the  development  of 
general  systems  programs  that  can  be  verified  to 
execute  in  conformity  with  precisely  stated 
specifications.


Design  of  Gypsy 


Gypsy was developed as an integrated
programming and specification language to
support the complete design, specification,
coding and verification of systems software,
with particular emphasis on communications
software. Specific goals were:

* Complete  Verifiability.  Every  feature  in 
the  language  must  be  rigorously 
verifiable.

* Incremental  Development.  The  language 
must  support  modular,  incremental  program 
development  and  verification.  As  best 
possible  the  language  must  simplify  the 
verification  process  by  encouraging  small 
modules  with  tightly  regulated 
interactions  and  by  isolating  and 
minimizing  the  effects  of  modifications  to 
previously  verified  code.  There  must  also 
be  a facility  for  partial  expression  of 
program  units. 

* Systems Programming. The language must
support the development of systems
software. There must be facilities for
expressing process concurrency and
synchronizing process communication.
There must also be facilities for
expressing real-time dependencies.

* Imperfect Execution Environments. The
language must support execution in
imperfect environments. It must be
possible to detect, isolate, and recover
from run-time anomalies as well as monitor
the program state.

* Specification Capability. The language
must provide an extensive specification
capability. For every property that is to
be verified, there must be an adequate
means of expressing it directly in the
language. The integration of formal
proof, run-time validation, and monitoring
must be consistent and provide a complete
whole.

The design of Gypsy has been a combined
process of synthesis and contraction. Starting
from Pascal [12], each existing Pascal
construct was carefully analyzed and those which
inhibited verification were modified or removed.
The hierarchical definition structure was
eliminated  and  protection  lists  were  added  to 
provide  a tighter,  more  flexible  environment  for 
incremental  program  development  and 
verification.  Facilities  for  expressing 
concurrency,  communication,  synchronization, 
timing  constraints,  external  events,  error 
recovery,  and  monitoring  were  added,  paying 
close  attention  to  the  requirements  of  the 
verification  methodology.  Each  construct  in  the 
program  code  and  the  specification  statements 
was  designed  to  support  the  verification 
methodology.  The  program  code  syntax  was 
modified to integrate these specification
statements  into  a logically  consistent  and 
hopefully  understandable  language. 


Designing  for  Verification 


A language  which  is  to  facilitate  coding 
and  specification  must  not  only  include 
capabilities  necessary  for  expressing  the 
problem  domain  of  interest,  but  must  exclude 
language  constructs  whose  semantics  defeat,  or 
impede,  verification.  We  defer  a discussion  of 
Gypsy's  specification  statements  until  a later 
section  for  pedagogical  reasons.  Their 
development  was,  however,  closely  interwoven 
with that of the coding statements.

Verification  of  program  code  has  only 
recently  become  a prominent  factor  in 
programming  language  design.  While  Pascal  was 
influenced by verification considerations
[5], more recently Nucleus [10] and Alphard
[21] have been expressly designed for
verification  by  formal  proofs.  Gypsy  also  is 
specifically  designed  for  verification,  but 
verification  by  run-time  validation  as  well  as 
by  formal  proof.  The  first  phase  in  the  design 
of  Gypsy  was  to  develop  a "conventional" 
language  which  was  free  from  concepts  known  to 
render  formal  proof  verification  difficult.  To 
this end, Pascal [12] was selected as a model
and  Gypsy  was  patterned  after  Pascal,  but  with 
significant  differences. 

Routines  in  Pascal  can  be  nested  to 
arbitrary  depths  which  creates  a hierarchy  of 
nested  "non-local"  variables.  Routines  in  Gypsy 
may  not  be  nested  and  variables  can  only  be 
defined  within  routines;  hence,  Gypsy  has  no 
non-local  variables.  This  simplifies 
verification as well as incremental program
development,  which  will  be  discussed  in  the  next 
section. 

Functions  in  Pascal  can  take  either 
variable  or  value  parameters  and  can  only  return 
values  of  a simple  type.  In  Gypsy,  functions 
are  allowed  constant  and  value  parameters  and 
they  can  return  values  of  any  type.  The 
restriction  to  these  kinds  of  parameters, 
together  with  the  absence  of  non-local 
variables,  guarantees  that  functions  produce  no 
side-effects. This simplifies verification
considerably. 


Pascal allows routines to be included as
parameters to other routines; Gypsy does not.
This decision was made not because of the
necessity to do dynamic type checking, but
because the extra burden on the verification
process did not appear worth the extra
capability.

Certain of Pascal's data types do not
appear at all in Gypsy. These are types "real",
"class", "pointer", and "file".

Pascal has "if", "case", "for", "while",
"repeat", and "goto" statements for execution
control. Gypsy has a similar set of statements,
"if", "case", "loop", and "leave", modified for
proper placement of assertions and to eliminate
the need for bracketing "begin-end" pairs. The
"if" statement is conventional except for a
trailing "end". The "case" statement has an
additional keyword "is" and an optional "else"
clause. The "loop" statement subsumes both the
"while" and "repeat" constructs as well as the
so-called "loop-and-a-half" construct and
infinite loops. Termination and looping are
controlled by "leave" statements. Gypsy has no
"goto" statement.

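For readers unfamiliar with the "loop-and-a-half" construct, the following Python sketch gives a rough analogy under stated assumptions: `while True` stands in for Gypsy's "loop" and `break` for "leave". The function name `read_until_sentinel` is hypothetical, and the sketch makes no claim about Gypsy's actual syntax.

```python
def read_until_sentinel(items, sentinel):
    """Collect items up to, but not including, the first sentinel."""
    collected = []
    source = iter(items)
    while True:                          # analogous to "loop"
        item = next(source, sentinel)    # first half: obtain the next value
        if item == sentinel:
            break                        # analogous to "leave"
        collected.append(item)           # second half: process the value
    return collected
```

The exit test sits in the middle of the body, which neither a "while" nor a "repeat" expresses directly; moving the test to the top or bottom of the body recovers those two forms.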

Designing  for  Incremental  Development 


A language  that  is  to  support  the 
development  and  evolution  of  verified  programs 
also  must  consider  the  practical  aspects  of 
verification.  In  developing  a verified  program 
of  any  significant  size,  it  is  necessary  that 
the  program  be  written  as  a large  collection  of 
small,  independently  verifiable  units. 
Otherwise,  a formal  proof  easily  can  expand  into 
a mass  of  detail  and  become  unmanageable.  Also 
for  proofs  to  be  maximally  effective  they  should 
be  carried  out  on  a unit-by-unit  basis  as  the 
program  is  developed.  Further,  it  is  the  nature 
of  systems  programs  that  they  are  continuously 
undergoing  evolution  and  with  each  modification 
some  amount  of  reverification  is  necessary.  It 
is, therefore, essential that the amount of
reverification be kept to a minimum. For these
reasons, we sought language features which
supported unit-by-unit manipulation, increased
unit independence, and isolated unit
interactions.

A Gypsy program consists of a series of
"routine", "macro", "constant", and "type"
units, which may appear in any order. If a
reference can not be resolved locally within a
particular unit, a search of the other external
unit names is made. When an unresolved local
reference is found to be an external unit name,
then the appropriate information is extracted
and  the  analysis  continued.  Access  rights  to 
any  unit  may  be  stated  in  an  "access  list." 
These  access  lists  will  be  checked  during  the 
process  of  resolving  references.  The 
combination  of  units  and  access  lists  provides  a 
high  degree  of  code  independence,  plus  a tightly 
controlled  environment. 

A routine  is  a "function",  a "procedure", 
a "process",  or  a "program".  A "program"  unit 
defines  an  initial  entry  point.  Routine 
declarations  can  only  appear  at  the  unit  level; 
hence,  Gypsy  does  not  provide  a nested 
hierarchy.  Besides  favoring  unit  independence, 
it  was  felt  (1)  that  a hierarchical  structure 
failed  to  provide  adequate  program  protection 
without access lists and (2) that with access
lists and without nonlocal variables a
hierarchical structure was unnecessary.

A macro unit binds a parameterized
expression to a name. While macro expansions
can be nested, they may not be recursively
expanded as there would be no way to terminate a
recursive expansion.

A constant  unit  parallels  Pascal's 
constant  declarations  except  that  a constant  may 
be  of  any  type  including  a structured  type. 
This  provides  the  means  for  referencing  global 
values without allowing global variables or
requiring  them  to  be  passed  as  parameters  if 
they  are  not  to  be  modified. 

A type  unit  declares  a new  type  either  by 
itemizing  its  value  set  or  by  composing  existing 
types.  A type  unit  which  includes  an  access 
list  is  the  equivalent  of  an  abstract  data 
structure [7] [13] [21] [3] [9].
The  intent  of  an  abstract  data  type  is  to  be 
able  to  construct  a new  type  and  to  restrict 
access  to  the  components  of  that  type  to 
operations  representative  of  the  type.  It 
should  be  possible  with  a proper  implementation 
to  alter  the  implementation  of  the  abstract  type 
and  the  corresponding  operations  without 
impacting  the  program  which  employs  the  abstract 
type. 


Designing  for  Concurrency  and  Real-Time 


Programming  languages  have  traditionally 
avoided  concurrency;  there  have,  however,  been 
exceptions.  The  Burroughs  family  of  extended 
Algol languages [14] provide processes and
process communication, Bliss [19] provides
coroutines and processes, Concurrent Pascal
[3] combines processes and monitors, and
Algol 68 [18] provides collateral elaboration
of clauses. Several other languages have
primitive means of accessing operating system
functions which provide concurrency. Operating
system research has generated a large number of
concurrency and synchronization techniques which
we will not attempt to reference. Two systems,
RC4000 [1] and HYDRA [20], were
significant factors in our decision on how to
specify and implement concurrency.

Gypsy  has  a routine  type  called  a 
"process". It differs from a "procedure" only
in the types of parameters allowed and in the
manner of its invocation. Processes communicate
only through message buffers [2]. A message
"buffer" is a finite length queue on which there
are only two operations defined,
"send" (enqueue) and "receive" (dequeue). The
queue  is  manipulated  by  a strict  FCFS  algorithm. 
Whenever  a "send"  is  made  on  a full  buffer  the 
sending  process  is  suspended  until  the  condition 
is  remedied.  Likewise,  a "receive"  on  an  empty 
buffer  will  cause  the  process  to  be  suspended. 
Associated  with  every  buffer  is  a semaphore 
which  guarantees  mutually  exclusive  access  to 
the  buffer. 
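
The buffer semantics just described (finite length, FCFS order, suspension on a full or empty buffer) resemble a bounded blocking queue. The sketch below uses Python's standard `queue.Queue` as a stand-in; this is an analogy under stated assumptions, not the Gypsy implementation. The function name `demo_buffer` is hypothetical, and a short timeout is used only so that the demonstration returns instead of staying suspended.

```python
import queue

def demo_buffer(capacity):
    """Fill a bounded FIFO buffer, show that a further send would block,
    then drain it. capacity must be at least 1 (0 means unbounded here)."""
    buf = queue.Queue(maxsize=capacity)      # finite-length FCFS queue
    for m in range(capacity):
        buf.put(m)                           # "send" succeeds while not full
    try:
        buf.put("extra", timeout=0.01)       # a real sender would stay suspended
        blocked = False
    except queue.Full:
        blocked = True                       # the send could not complete
    received = [buf.get() for _ in range(capacity)]   # "receive" in FCFS order
    return blocked, received
```

`queue.Queue` guards its internal state with a lock, which plays a role comparable to the semaphore associated with each Gypsy buffer.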


Concurrent processes are initiated by a
"cobegin..end" statement and may or may not
terminate. Only when all processes called
within a "cobegin" statement terminate will the
statement following the "cobegin" be executed.

Polling is an important function of
real-time systems; hence, it must be possible to poll
a buffer without being suspended indefinitely
trying to receive from an empty buffer. Gypsy
has an "await" statement which allows the
simultaneous waiting on the completion of any
one of several buffer operations. An "await" is
in many respects a guarded command [8],
except that it has a very restricted set of
guards and it has an optional time-out clause.
The time-out clause specifies what is to be done
if none of the requested operations completes by
a certain time.

The concept of (real) time is provided by
"clock" variables. A clock variable is a
special variable which may not be modified by
the program, but which is always changing.
There may be any number of clocks in a program,
but there is no guarantee that they will be
synchronized. Gypsy programs may be distributed
across many machines and this makes
synchronization virtually impossible.


Designing for Imperfect Execution Environments


An attribute of real-time software often
overlooked in programming languages is the
existence of both hardware and software faults.
Fault detection, isolation, and recovery is an
essential function in real-time software and,
consequently, languages for expressing such
software should (1) provide capabilities for
fault control programming and (2) provide an
interface to the hardware which allows for the
detection, isolation, and recovery of faults.
The work of the Newcastle group [16]
represents virtually all of the previous efforts
on this topic.

A "condition" in Gypsy is an instantaneous
event which may occur during the execution of a
program. There is a large class of predefined
"conditions" which correspond to hardware errors
and dynamic language semantics errors, such as
"caseerror". Programmers may, in addition, name
and signal fault conditions by using a "signal"
statement or an "otherwise" clause on a
specification (discussed in the next section).

Any statement ending with the word "end"
may optionally end with a "condition clause"
followed by the word "end". The effect of the
condition clause is that whenever a condition
occurs, an immediate branch is taken to the
condition clause, of the innermost containing
statement, which specifies an action for that
condition. Searching for the innermost
condition clause may involve exiting a routine.
After the condition clause is executed, control
does not return to where it was before the
fault, but instead drops out of the statement
whose condition clause was executed. In some
sense, a condition clause is a restricted
version of a PL/I "on" condition which resembles
one of Zahn's event driven case statements
[22].


Designing for Specification


Gypsy plays the dual role of programming
and specification language. The specification
component of the language permits the precise
expression of desired functional properties of
key parts of the program. These properties are
stated in terms of valid states that are to be
maintained on the data objects of the program at
various points in the program computation. The
objective of a verification is to show that the
computation always proceeds in conformity with
the stated specifications. The conformity of
the program with its specification can in most
cases be either proved prior to execution or
validated during execution. The same
specification methods are used in both
approaches to verification.

All  specifications  in  Gypsy  are  stated  as 
boolean-valued  expressions.  These 
specifications  are  designated  to  be  verified 
either by proof, by run-time validation, or
simply assumed. Specifications that are proved
or assumed need not be evaluated at run-time,
and therefore, they are permitted to contain
special operations and types that could not
otherwise be permitted. For example, boolean
expressions may contain the logical quantifiers
"for all" and "there exists" and refer to
rational  numbers  and  infinite  sequences. 

The most familiar kinds of specifications
used in Gypsy are the "entry", "exit", and
"assert" statements for procedures and
functions. These follow the same form as that
introduced by [11] for proving Pascal
programs. The "exit" specification is
interpreted in the weak sense, i.e. it holds if
the program terminates.

"Entry", "exit", and "assert"
specifications also can be used with processes.
However, processes often are intentionally
programmed never to terminate, and therefore an
"exit" specification may be of no value.
Specifications can be stated for non-terminating
processes through "block" specifications. A
"block" specification holds whenever the process
is suspended by a buffer operation. This
provides a temporary halting point.
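
The weak reading of "entry" and "exit" specifications, when they are validated at run-time rather than proved, can be sketched in Python as follows. This is an illustration under stated assumptions, not Gypsy syntax: the decorator `specified`, its parameters `entry_spec` and `exit_spec`, and the example routine `int_sqrt` are all hypothetical names.

```python
def specified(entry_spec, exit_spec):
    """Attach run-time checked entry and exit specifications to a routine."""
    def wrap(fn):
        def checked(*args):
            if not entry_spec(*args):
                raise AssertionError("entry specification violated")
            result = fn(*args)   # if fn never returns, exit is never evaluated
            if not exit_spec(result, *args):
                raise AssertionError("exit specification violated")
            return result
        return checked
    return wrap

@specified(entry_spec=lambda n: n >= 0,
           exit_spec=lambda r, n: r * r <= n < (r + 1) * (r + 1))
def int_sqrt(n):
    """Largest r with r*r <= n, found by linear search."""
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r
```

Because the exit check runs only after the routine returns, it says nothing about termination, which matches the weak interpretation. A non-terminating process never reaches such a check, which is the gap the "block" specification is meant to fill.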

Specifications  for  routines  performing 
buffer  manipulations  normally  are  stated  in 
terms  of  effects  on  buffer  histories.  In  the 
terminology of [6], these are "mythical
variables", but they are provided in a uniform
way rather than being installed by the
programmer. Associated with every buffer b
are several histories that are relevant to
specifications and to the proof methodology.
For example, "b.infrom" refers to the sequence
of objects received in from the buffer by the
process, and "b.outto" is the sequence of
objects sent out to the buffer from the process.
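
A minimal Python sketch of buffer histories follows. It assumes a single process using the buffer and ignores capacity and blocking; the class name `HistoryBuffer` is hypothetical, and the attribute names `infrom` and `outto` are borrowed from the text only for readability.

```python
class HistoryBuffer:
    """Queue that also records what one process has sent and received."""

    def __init__(self):
        self.items = []     # current buffer content, oldest first
        self.outto = []     # every object sent out to the buffer so far
        self.infrom = []    # every object received in from the buffer so far

    def send(self, m):
        self.items.append(m)
        self.outto.append(m)        # histories only grow, never shrink

    def receive(self):
        m = self.items.pop(0)       # FCFS: take the oldest object
        self.infrom.append(m)
        return m
```

In Gypsy the histories exist only for specification and proof; in this sketch they are ordinary lists so that a specification over them can be evaluated directly.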

Any sequence of "var" declarations can be
followed by a "keep" specification. The "keep"
expression must be maintained throughout the
immediate scope of the "var" declaration. A
procedure or function call releases the "keep",
but the called unit must reestablish it before
returning. This type of assertion is similar to
those used by [17] for run-time validation.

Routines that have access to the internal
structure of a type have both internal and
external specifications. This follows the
specification methods of Alphard forms [21].
External specifications are visible to the
outside environment; internal ones are not. The
external specifications are the "entry",
"block", and "exit" specifications. "Centry",
"cblock", and "cexit" are the corresponding
internal specifications. The internal
specifications may refer to the internal
(concrete) structure of accessible types; the
external specifications may not.

Two kinds of specifications can be stated
for Gypsy type definitions, "require" and
"axiom". The require specification follows
Alphard and is a precondition on the type
parameters that is necessary for the proper
creation of an object of that type. The axiom
is a relation among the functions that have
access to the type.

This  set  of  specification  methods  provides 
powerful  mechanisms  for  stating  functional 
properties  of  programs,  and  formal  proof  methods 
have  been  defined  for  proving  each  of  these 
types  of  properties.  The  specifications  do  not, 
at  this  time,  directly  permit  the  definition  of 
quantitative  aspects  of  program  behavior  such  as 
resource  utilization. 


A Message  Switching  Network 


The  following  example  follows  part  of  the 
development  of  a simple  message  switching 
network and illustrates many of the important
features  of  Gypsy.  Only  the  specification  and 
implementation  of  the  network  will  be  discussed. 
Its  verification  is  beyond  the  scope  of  this 
paper.  The  development  of  the  network  will  be 
top-down,  but  Gypsy  admits  any  kind  of  program 
design  strategy. 

The top-level structure of the network is
shown in figure 1. The network switches
messages among a fixed number of users, each of
which communicates with the network through a
port. We will ignore protocols, and assume that
each message is a separate, complete
communication. Even at this early stage of
development, the network can be written in
Gypsy.

program Network(var upa:PortArray) = pending;

type PortArray = array(UserId) of Port;

type UserId = integer(1..NUsers);
const NUsers:integer = pending;

type Port = record(Get,Put:Line);
type Line = buffer(CSize) of Message;
const CSize:integer = pending;

type Message = pending;


Figure 1

This program gives a precise description of the
lines of communication between the network and
its external environment. Communication is
through an "array(UserId) of Port." Each port is
a record consisting of two buffers, and each
buffer contains a maximum of csize messages.
The type userid is an integer restricted to the
range (1..nusers). The actual number of users,
the maximal buffer size, the structure of
messages, and the implementation of the network
are left pending.

In  a simple  network  with  no  protocols,  the 
fundamental specification that is desired is
that  messages  are  delivered  properly  among  all 
possible  pairs  of  users.  This  specification  for 
the  network  can  be  written  as 

program Network(var upa:PortArray) =
begin
  block all i,j:UserId,
    ProperDelivery(i,j,upa);
  pending;
end;

The  specification  is  written  as  a block,  instead 
of  an  exit,  specification  because  we  intend  the 
message  network  to  be  non-terminating.  The 

network  being  blocked  means  that  all  processes 
in  the  network  are  blocked.  This  could  happen 
for  any  number  of  reasons,  including  deadlock, 
but  in  this  example,  it  will  mean  that  there  is 
no  further  input  available  from  any  user. 

Before we can proceed with the
implementation of the network, it is necessary
that we be more specific about the meaning of
"properdelivery." Loosely speaking, what we
mean is that user j receives only those messages
that were intended for j. We will make this
definition precise with a macro,

define ProperDelivery(i,j,pa) =
  mail(pa(j).Put.outto,i,j)
  sub mail(pa(i).Get.infrom,i,j);

(The macro definition was chosen to illustrate
the use of macros. ProperDelivery also could
have been defined using a function.)

Pa(j).put.outto is the sequence of all messages
sent out to buffer pa(j).put by the network, and
pa(i).get.infrom is the sequence of messages
received in from buffer pa(i).get. The
function mail(ms,i,j) is the subsequence of
messages in message sequence ms that are
directed from port i to j.
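
The ProperDelivery macro can be approximated in Python as below. This is a sketch under stated assumptions: messages are modelled as hypothetical `(source, destination, text)` tuples, histories are plain lists, `mail` is written as a simple filter, and "sub" is read as the order-preserving subsequence relation (elements may be missing but not reordered). The function names are illustrative.

```python
def is_subsequence(xs, ys):
    """True when xs can be obtained from ys by deleting elements, order kept."""
    it = iter(ys)
    return all(x in it for x in xs)    # each search resumes where the last ended

def mail(ms, i, j):
    """Messages in ms directed from port i to port j, in their original order."""
    return [m for m in ms if m[0] == i and m[1] == j]

def proper_delivery(i, j, put_outto, get_infrom):
    """put_outto: history of pa(j).Put.outto; get_infrom: history of pa(i).Get.infrom."""
    return is_subsequence(mail(put_outto, i, j), mail(get_infrom, i, j))

sent = [(1, 2, "a"), (1, 3, "x"), (1, 2, "b"), (1, 2, "c")]
delivered = [(1, 2, "a"), (1, 2, "c")]     # message "b" was dropped in transit
print(proper_delivery(1, 2, delivered, sent))
```

Under this reading a dropped message is tolerated, while a reordered or fabricated message makes the check fail.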

The  completion  of  the  definition  of 
properdelivery requires a precise definition of
the  mail  function,  and  mail  in  turn  will  require 
some  additional  information  about  messages. 

function mail(ms:MessageSequence; i,j:UserId)
    :MessageSequence =
begin
  exit (assume mail(ms,i,j) =
    if ms=MessageSequence()
    then MessageSequence()
    else if i = Source(first(ms)) and
            j = Destination(first(ms))
      then MessageSequence(first(ms))
           @ Mail(nonfirst(ms),i,j)
      else Mail(nonfirst(ms),i,j)
    fi fi);
end;

type MessageSequence = sequence of Message;

type Message<Source,Destination,Text,Compose,
             Equal> =
begin
  axiom all m:Message,
    Equal(Compose(Source(m),Destination(m),
                  Text(m)),m);
  pending;
end;

function Source(m:Message):UserId =
  pending;

function Destination(m:Message):UserId =
  pending;

function Text(m:Message):CString =
  pending;

function Compose(s,d:UserId; t:CString)
    :Message = pending;

function Equal(m1,m2:Message):boolean =
  pending;

type CString = sequence (100) of char;

The definition of mail(ms,i,j) is given as an
assumed exit specification which gives a
complete recursive definition of mail. The
definition of mail requires a new type,
messagesequence. The type definition "sequence
of message" defines a potentially infinite
sequence of messages. Sequences are given a
precise meaning by the semantics of Gypsy, but
it is not necessary that they be implemented.
Gypsy has a number of these kinds of constructs.
They are included for purposes of formal program
analysis, and may appear anywhere in a program
where execution is not required, such as in
specifications that are proved or assumed. In
contrast, the type cstring is a sequence of
ASCII characters of maximal size 100. Normally
a Gypsy implementation would contain finite
sequences but not infinite ones. Size
restrictions can be enforced by run-time checks,
and both kinds of sequences share a common
semantics. Messagesequence() denotes the empty
sequence of messages. In general, type names
can be used to construct objects of that type.
The @ operator is the sequence append operator.
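
The recursive definition of mail can be mirrored in Python. The sketch assumes finite lists in place of Gypsy sequences and a hypothetical named-tuple representation of messages (the paper leaves the representation pending); list concatenation stands in for the append operator.

```python
from collections import namedtuple

# Hypothetical concrete message, used only so the sketch can run.
Message = namedtuple("Message", ["source", "destination", "text"])

def mail(ms, i, j):
    """Subsequence of ms directed from i to j, following the recursive definition."""
    if not ms:                                   # ms = MessageSequence()
        return []
    first, nonfirst = ms[0], ms[1:]
    if first.source == i and first.destination == j:
        return [first] + mail(nonfirst, i, j)    # keep first, append the rest
    return mail(nonfirst, i, j)                  # skip first
```

For example, `mail([Message(1, 2, "a"), Message(2, 1, "b")], 1, 2)` keeps only the first message. The recursion depth equals the length of the input, so this form suits short illustrative sequences rather than long histories.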

The definition of mail makes use of two
functions on messages, source and destination.
The type definition of message permits these
functions, as well as text, compose, and equal,
access to the internal structure of messages,
which is left pending. The axiom states an
identity relation that must be maintained among
this set of functions. This axiom implies that
three kinds of information can be extracted from
a message: a source, a destination, and a text
part. The source and destination are the means of
directing a message from one user to another,
and the text is the actual content of the
message to be transmitted. The compose function
builds a message from these three parts, and
equal defines message equality. This is the
only information that we will need to know about
messages to carry out the full specification,
implementation, and verification of the network
process. Eventually, of course, we must choose
a concrete representation of messages, and prove
that the representation and the implementation
of the functions that can access it satisfy the
axioms.

Now we can give a completely precise
interpretation to properdelivery. For every i,j
pair, the mail from source i that is sent out to
port j must be a subsequence of the mail
received in from port i that is designated for
destination j. This requires that the messages
be the same and that they arrive in the same
order that they were sent. The subsequence
relation permits the network to drop messages.
This is a concession to the reality of
potentially unrecoverable transmission failures.
This completes the specification of network.
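
To make the subsequence requirement concrete, the following Python sketch checks the delivery condition on finite message histories. It is an illustration under stated assumptions rather than Gypsy: messages are represented as hypothetical (source, destination, text) tuples, and the names is_subsequence and proper_delivery are introduced only for this example.

```python
def mail(ms, i, j):
    """Messages of ms from source i to destination j, in their original order."""
    return [m for m in ms if m[0] == i and m[1] == j]


def is_subsequence(a, b):
    """True if a can be obtained from b by deleting zero or more elements."""
    remaining = iter(b)
    # Each element of a must be found in what is left of b, so order is respected.
    return all(any(x == y for y in remaining) for x in a)


def proper_delivery(out_to_j, in_from_i, i, j):
    """Mail delivered to port j from i must be a subsequence of mail sent by i to j."""
    return is_subsequence(mail(out_to_j, i, j), mail(in_from_i, i, j))


sent = [(1, 2, "a"), (1, 3, "x"), (1, 2, "b"), (1, 2, "c")]
delivered_with_loss = [(1, 2, "a"), (1, 2, "c")]   # message "b" was dropped
delivered_reordered = [(1, 2, "c"), (1, 2, "a")]   # order was not preserved
ok = proper_delivery(delivered_with_loss, sent, 1, 2)
bad = proper_delivery(delivered_reordered, sent, 1, 2)
```

A history with a dropped message satisfies the condition, while a reordered history does not.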

We can proceed with the top-down design at
any place in the current Gypsy program where a
pending appears. There are many ways this
program could be implemented to satisfy the
block specification, but we will choose the
following.

program network(var upa:PortArray) =
begin
  block all i,j:UserId,
    ProperDelivery(i,j,upa);
  var npa:PortArray;
  cobegin
    Node(upa(i),npa(i),i) each i:UserId;
    Switch(npa);
  end;
end;

process Node(var up,np:Port; i:UserId) =
begin
  block up.Put.outto sub np.Put.infrom
    and np.Get.outto sub up.Get.infrom;
  pending;
end;

process Switch(var npa:PortArray) =
begin
  block all i,j:UserId,
    ProperDelivery(i,j,npa);
  pending;
end;

This implements the program as a star network
where each user is attached to exactly one node,
and all of the nodes are connected to a single
switch as shown in figure 2. Each node is
similar to a full-duplex channel program passing
messages unaltered, and in sequence, between the
user and the central switch. All of the nodes
and the switch are set into concurrent execution
by the cobegin in the network program.

A node can be implemented by decomposing
it into two one-way channels operating
asynchronously.

process Node(var up,np:Port; i:UserId) =
begin
  block up.Put.outto sub np.Put.infrom
    and np.Get.outto sub up.Get.infrom;
  cobegin
    Pass(np.Put,up.Put,i,Depart);
    Pass(up.Get,np.Get,i,Arrive);
  end;
end;

Figure 2

process Pass(var x,y:Line; i:UserId;
             d:Direction) =
begin
  block y.outto sub x.infrom;
  var m:Message;
  loop
    assert y.outto sub x.infrom;
    receive m from x;
    trace i = if d = Depart
              then Destination(m)
              else Source(m) fi;
    send m to y;
  end;
end;

type Direction = (Arrive, Depart);

Pass is intentionally programmed as a
nonterminating loop. The loop simply receives
messages from line x and passes them on to line
y, performing a trace depending on the value of
d. The send and receive statements are
potential blockage points, and these are the
points where the block specification must hold.
A cobegin, as in node or network, is also a
potential blockage point.
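
The behavior of pass, with its two potential blockage points, can be approximated with bounded queues and a thread. The Python sketch below is an assumption-laden illustration, not a translation: Python's queue.Queue stands in for a Gypsy buffer, the STOP sentinel is a hypothetical device added so that the example terminates (the Gypsy loop does not), and the trace statement is omitted.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel; not part of the Gypsy program


def pass_messages(x: queue.Queue, y: queue.Queue) -> None:
    """Forward messages from line x to line y, unaltered and in order."""
    while True:
        m = x.get()      # blockage point: waits while x is empty
        if m is STOP:
            break
        y.put(m)         # blockage point: waits while y is full


x = queue.Queue(maxsize=4)   # bounded, like a buffer of assumed size 4
y = queue.Queue(maxsize=4)
worker = threading.Thread(target=pass_messages, args=(x, y))
worker.start()
for m in [(1, 2, "a"), (3, 2, "b")]:
    x.put(m)
x.put(STOP)
worker.join()
received = [y.get_nowait(), y.get_nowait()]
```

Because the loop only moves items from x to y, the sequence seen on y is always a prefix of the sequence taken from x, which is a special case of the subsequence relation in the block specification.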

The  switch  process  also  loops  forever 
waiting  on  each  buffer  in  its  turn  for  a small 
time  slice.  If  input  is  ready  it  will  receive 
it;  otherwise,  it  will  time  out  and  go  on  to  the 
next  buffer. 

process Switch(var npa:PortArray) =
begin
  block all i,j:UserId,
    ProperDelivery(i,j,npa);
  var m:Message;
  var k:UserId;
  cond DestinationErr;
  keep Destination(m) in (1..NUsers)
    otherwise DestinationErr;
  loop
    k := 1;
    loop
      if k > NUsers then leave end;
      assert all i,j:UserId,
        ProperDelivery(i,j,npa);
      await
        on receive m from npa(k).Get:
          send m to npa(Destination(m)).Put;
        after TimeSlice: ;
      when
        is DestinationErr: ;
      end;
      k := k + 1;
    end;
  end;
end;

const TimeSlice:integer = pending;

Switch repeatedly iterates through the get
buffers of the ports attempting to receive a
message. Control leaves the inner loop at the
leave statement, and the outer loop runs
indefinitely. If a message is not received in
timeslice amount of time, the await is exited
and the next buffer is considered. If a message
is received within the allocated amount of time,
it is sent to the appropriate destination. The
keep specification of switch is evaluated each
time one of its variables is assigned a new
value. If the specification ever is violated, a
destination error is signalled. The keep
prevents an invalid array index in the send
statement. If the error occurs, control is
transferred to the when part of the await and
the destinationerr part of the when is
performed. In this case, switch does nothing,
thus dropping the message. This conforms with
the subsequence relation specified in
properdelivery.
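
The polling discipline of switch can be sketched in Python as a single pass of the outer loop. This is an illustration only: queue.Queue with a timeout stands in for the await statement, the values given to N_USERS and TIME_SLICE are assumed because the paper leaves them pending, messages are hypothetical (source, destination, text) tuples, and the queues are unbounded, so unlike a Gypsy send the put never blocks.

```python
import queue

N_USERS = 3        # assumed value for the pending constant NUsers
TIME_SLICE = 0.01  # assumed value, in seconds, for the pending constant TimeSlice


def switch_pass(get_bufs, put_bufs):
    """Poll each node's Get buffer once; return the number of dropped messages."""
    dropped = 0
    for k in range(1, N_USERS + 1):
        try:
            m = get_bufs[k].get(timeout=TIME_SLICE)
        except queue.Empty:
            continue                  # timed out: go on to the next buffer
        dest = m[1]
        if not 1 <= dest <= N_USERS:
            dropped += 1              # counterpart of DestinationErr: drop it
            continue
        put_bufs[dest].put(m)         # route to the destination node's Put buffer
    return dropped


get_bufs = {k: queue.Queue() for k in range(1, N_USERS + 1)}
put_bufs = {k: queue.Queue() for k in range(1, N_USERS + 1)}
get_bufs[1].put((1, 3, "hello"))
get_bufs[2].put((2, 9, "lost"))       # destination 9 is outside 1..N_USERS
dropped = switch_pass(get_bufs, put_bufs)
delivered = put_bufs[3].get_nowait()
```

The message with the invalid destination is discarded rather than routed, which is consistent with a specification that only requires delivered mail to be a subsequence of sent mail.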


The  process  structure  of  the  complete 
network  is  shown  in  figure  3.  All  of  these 
processes  run  concurrently.  The  intermediate 
level  of  a node  process  was  not  necessary.  The 
pass processes could have been invoked
explicitly from the cobegin in the network. The
extra  level  of  decomposition  is  helpful 
conceptually  and  in  breaking  the  network  into 
small,  individually  verifiable  components. 


Now  let  us  return  to  the  implementation  of 
messages.  They  will  be  implemented  in  the 
obvious  way  as  a record  of  three  fields, 

type Message<Source,Destination,Text,Compose,
             Equal> =
begin
  axiom all m:Message,
    Equal(Compose(Source(m),Destination(m),
                  Text(m)),m);
  record(s,d:UserId; t:CString);
end;

function Source(m:Message):UserId =
begin
  cexit Source(m) = m.s;
  result := m.s;
end;

function Destination(m:Message):UserId =
begin
  cexit Destination(m) = m.d;
  result := m.d;
end;

function Text(m:Message):CString =
begin
  cexit Text(m) = m.t;
  result := m.t;
end;

Figure 3

function Compose(s,d:UserId; t:CString)
  :Message =
begin
  cexit Compose(s,d,t) = Message(s,d,t);
  result := Message(s,d,t);
end;

function Equal(m1,m2:Message):boolean =
begin
  exit Equal(m1,m2) iff Equal(m2,m1);
  cexit Equal(m1,m2) iff
    m1.s=m2.s and m1.d=m2.d and m1.t=m2.t;
  result := m1.s=m2.s and m1.d=m2.d
            and m1.t=m2.t;
end;

In the functions that are permitted access to
the internal structure of messages, centry and
cexit specifications also are permitted access
to the internal structure, but entry and exit
specifications are not. Entry and exit
specifications are visible externally, to the
routines that call the functions, but centry and
cexit specifications are not. This prevents the
external specifications from revealing the
internal structure of messages. In a function
the local variable with the reserved name
"result" holds the value assigned to the function
upon exit. It can be used in the same way as
any other local variable. In the function
compose, Message(s,d,t) is another example of
the type name used to construct an object of
that type. Message(s,d,t) creates a message
with successive fields equal to s, d, and t.
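
The record representation and the axiom relating the five access functions can be exercised on sample values. The Python sketch below is an informal check under stated assumptions, not a proof and not Gypsy: a frozen dataclass stands in for the record, and the lower-case function names are chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    s: int   # source user id
    d: int   # destination user id
    t: str   # text


def source(m: Message) -> int:
    return m.s


def destination(m: Message) -> int:
    return m.d


def text(m: Message) -> str:
    return m.t


def compose(s: int, d: int, t: str) -> Message:
    return Message(s, d, t)


def equal(m1: Message, m2: Message) -> bool:
    # Field-by-field comparison, as in the cexit specification of Equal.
    return m1.s == m2.s and m1.d == m2.d and m1.t == m2.t


m = compose(1, 2, "hi")
# The axiom: recomposing a message from its three parts yields an equal message.
axiom_holds = equal(compose(source(m), destination(m), text(m)), m)
```

Checking the axiom on one value shows only that this representation is plausible; the paper's point is that the property must be proved for all messages.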

This  completes  the  program  and  its 
specifications  except  for  assigning  values  to 
the  pending  constants  nusers,  csize,  and 
timeslice.  The  program  at  this  stage  of 
development  can  be  written  as 

program network(var upa:PortArray) =
begin
  block all i,j:UserId,
    ProperDelivery(i,j,upa);
  var npa:PortArray;
  cobegin
    Node(upa(i),npa(i),i) each i:UserId;
    Switch(npa);
  end;
end;

type PortArray = array(UserId) of Port;

type UserId = integer[1..NUsers];
const NUsers:integer = pending;

type Port = record(Get,Put:Line);
type Line = buffer (CSize) of Message;
const CSize:integer = pending;

define ProperDelivery(i,j,pa) =
  mail(pa(j).put.outto,i,j)
    sub mail(pa(i).get.infrom,i,j);

function mail(ms:MessageSequence; i,j:UserId)
  :MessageSequence =
begin
  exit (assume mail(ms,i,j) =
    if ms=MessageSequence()
    then MessageSequence()
    else if i = Source(first(ms)) and
            j = Destination(first(ms))
         then MessageSequence(first(ms))
              @ mail(nonfirst(ms),i,j)
         else mail(nonfirst(ms),i,j)
    fi fi);
end;

type MessageSequence = sequence of Message;




process Node(var up,np:Port; i:UserId) =
begin
  block up.Put.outto sub np.Put.infrom
    and np.Get.outto sub up.Get.infrom;
  cobegin
    Pass(np.Put,up.Put,i,Depart);
    Pass(up.Get,np.Get,i,Arrive);
  end;
end;

process Pass(var x,y:Line; i:UserId;
             d:Direction) =
begin
  block y.outto sub x.infrom;
  var m:Message;
  loop
    assert y.outto sub x.infrom;
    receive m from x;
    trace i = if d = Depart
              then Destination(m)
              else Source(m) fi;
    send m to y;
  end;
end;

type Direction = (Arrive, Depart);

process Switch(var npa:PortArray) =
begin
  block all i,j:UserId,
    ProperDelivery(i,j,npa);
  var m:Message;
  var k:UserId;
  cond DestinationErr;
  keep Destination(m) in (1..NUsers)
    otherwise DestinationErr;
  loop
    k := 1;
    loop
      if k > NUsers then leave end;
      assert all i,j:UserId,
        ProperDelivery(i,j,npa);
      await
        on receive m from npa(k).Get:
          send m to npa(Destination(m)).Put;
        after TimeSlice: ;
      when
        is DestinationErr: ;
      end;
      k := k + 1;
    end;
  end;
end;

const TimeSlice:integer = pending;

type Message<Source,Destination,Text,Compose,
             Equal> =
begin
  axiom all m:Message,
    Equal(Compose(Source(m),Destination(m),
                  Text(m)),m);
  record(s,d:UserId; t:CString);
end;

type CString = sequence (100) of char;

function Source(m:Message):UserId =
begin
  cexit Source(m) = m.s;
  result := m.s;
end;

function Destination(m:Message):UserId =
begin
  cexit Destination(m) = m.d;
  result := m.d;
end;

function Text(m:Message):CString =
begin
  cexit Text(m) = m.t;
  result := m.t;
end;

function Compose(s,d:UserId; t:CString)
  :Message =
begin
  cexit Compose(s,d,t) = Message(s,d,t);
  result := Message(s,d,t);
end;

function Equal(m1,m2:Message):boolean =
begin
  exit Equal(m1,m2) iff Equal(m2,m1);
  cexit Equal(m1,m2) iff
    m1.s=m2.s and m1.d=m2.d and m1.t=m2.t;
  result := m1.s=m2.s and m1.d=m2.d
            and m1.t=m2.t;
end;

There  are  many  details  of  Gypsy  that  this 
example  does  not  illustrate,  but  the  development 
of this program and its specifications provides
a good  overview  of  the  philosophy  and 
capabilities  of  the  language. 


Conclusion 


Gypsy  has  a number  of  important  and 
distinctive  aspects.  It  is  a high-level 
language for general purpose computing that also
supports  the  development  of  systems  programs. 
It  includes  facilities  for  concurrency  and 
timing, execution in imperfect run-time
environments,  and  an  access  control  mechanism. 
Gypsy  includes  extensive  and  powerful  facilities 
for  expressing  functional  specifications  of  its 
programs  and  of  the  units  from  which  the  program 
is  structured.  All  constructs  in  Gypsy  are 
verifiable  either  by  formal  proof  or  run-time 
validation.  Run-time  validation  can  be  used 
effectively  to  reduce  the  size  and  complexity  of 
the  formal  proofs.  Facilities  are  provided  for 
decomposing  both  routines  and  data  into  small, 
logically meaningful units that can be verified
independently. This modularity greatly enhances
the practical feasibility of formal proofs. We
believe that integrating these features smoothly
into  a common  language  is  a significant  step 
forward  in  the  design  of  languages  to  support 
the  systematic  development  of  highly  reliable 
computer  programs. 


Bibliography


[1] Brinch Hansen, Per. "The Nucleus of a
Multiprogramming System," CACM 13, 4 (1970).

[2] Brinch Hansen, Per. "Operating Systems
Principles," Prentice-Hall (1973).

[3] Brinch Hansen, Per. "The Purpose of
Concurrent Pascal," Proceedings ICRS (1975).

[4] Burger, Wilhelm. "Formal Semantic
Definition of GYPSY," (in preparation).

[5] Buxton, J.N. and B. Randell, eds.
Software Engineering Techniques, NATO
Science Committee (1970).

[6] Clint, M. "Program Proving: Co-routines,"
Acta Informatica, 2 (1973).

[7] Dahl, O.-J. "Notes on Data Structuring,"
Dahl, Dijkstra, and Hoare, Structured
Programming, Academic Press (1972).

[8] Dijkstra, Edsger W. "Guarded Commands,
Nondeterminacy, and Formal Derivation of
Programs," CACM 18, 8 (1975).

[9] Flon, Lawrence. "A Survey of Some Issues
Concerning Abstract Data Types," Technical
Report, Carnegie-Mellon (1974).

[10] Good, D.I., and L.C. Ragland. "Nucleus--A
Language for Provable Programs," Program
Test Methods, Hetzel (ed.), Prentice-Hall
(1973).

[11] Igarashi, S., R.L. London, and D.C.
Luckham. "Automatic Program Verification
I: A Logical Basis and Its
Implementation," Report ISI/RR-73-11, Inf.
Sci. Inst. USC (1973).

[12] Jensen, Kathleen and Niklaus Wirth.
"Pascal User Manual and Report," Springer
Verlag (1974).

[13] Liskov, Barbara and Stephen Zilles. "An
Approach to Abstraction," Computation
Structures Group Memo 88, MIT (1973).

[14] Lyle, Don M. "A Hierarchy of High Order
Languages for Systems Programming," Proc.
of SIGPLAN Symp. on Languages for Systems
Implementation (1971).

[15] Moriconi, Mark S. "An Interactive System
for Incremental Program Design and
Verification," (in preparation).

[16] Randell, B. "System Structure for
Software Fault Tolerance," Proceedings
ICRS (1975).

[17] Stucki, L.G. "Testing Impact on the Future
of Software Engineering," Proc. of Fourth
Texas Conf. on Computing Systems,
University of Texas (1975).

[18] van Wijngaarden, A., et al. "Report on the
Algorithmic Language ALGOL 68," Numerische
Mathematik, 14 (1969).

[19] Wulf, W.A., D.B. Russell, and A.N.
Habermann. "BLISS: A Language for
Systems Programming," CACM 14, 12 (1971).

[20] Wulf, W., R. Levin, and C. Pierson.
"Overview of the Hydra Operating System
Development," Proc. of Fifth Symp. on
Operating Systems Principles (1975).

[21] Wulf, W.A., R.L. London, and Mary Shaw.
"Abstraction and Verification in Alphard,"
preliminary draft, Carnegie-Mellon (1976).

[22] Zahn, Charles T. "A Control Statement for
Natural Top-Down Structured Programming,"
Symp. on Programming Languages (1974).




LIST  OF  ATTENDEES 


Invitational Software V & V Conference
Aug  3,4,5  1976 


Lt  James  Boyd 
RADC/OCSP 

Griffiss  AFB  NY  13441 
315-330-4481 

Mr.  J.  Brookshire 

U.S.  Army  Missile  Command 

DRMSI-RGG 

Redstone  Arsenal  AL  35809 
205-876-2601 

Bill  Carlson 

Defense  Advanced  Research  Projects  Agency 
1400  Wilson  Blvd 
Arlington  VA  22209 
Autovon  224-5037 

Mr.  M.E.  Champaign 
CCCTC-WAD  Code  C420 
11440  Isaac  Newton  Square  N. 

Reston  VA  22090 
703-437-2326 

Dr.  T.  E.  Cheatham 
Aiken Comp. Lab
Harvard  University 
Cambridge  MA  02138 
617-495-3989 

Prof.  Roger  Cheung 
Dept. of Comp. Science
Northwestern University
Evanston  IL  60201 
312-492-5246 

Lori  Clarke 
Dept of Comp Science
University of Massachusetts
Amherst  MA  01002 

Hoy  Coppinger 
H.O.S. 

843  Mass.  Ave 
Cambridge  MA  02139 
617-661-8900 


Dr.  Edward  Crapp 
CTEC 

7777 Leesburg Pike
Falls  Church  VA  22041 
703-821-3700 

Steve  Crocker 

U.S.C.  Information  Sci.  Institute 
4676  Admiralty  Way 
Marina  Del  Rey  CA  90291 
213-822-1511 

Dr. John Darringer
IBM Research Center Lab
P.O. Box 218

Yorktown Heights NY 10598
914-628-1880 

Donald  DeVorkin 
C.S.  Draper  Lab 
75 Cambridge Parkway
Cambridge  MA  02142 
617-258-1225 

Dr.  Bernard  Elspas 
Comp.  Sci.  Group 
333 Ravenswood Ave
Menlo  Park  CA  94025 
415-326-6200 

Carl Engelman
MITRE  Corp. 

P.O. Box 208
Bedford  MA  01730 
617-271-2805 

Carolyn  Gannon 
General  Research  Corp. 

5383 Hollister Ave
P.O. Box 3587
Santa  Barbara  CA  93105 
805-964-7724 




John  B.  Glore 
MITRE Corp.

P.O. Box 208
Bedford  MA  01730 
617-271-3022 

Dr.  Donald  Good 
Comp.  Science  Dept. 
University  of  Texas 
Austin TX 78712
471-4353 


Maj Ben A. Johnson
BMDATC  Attn:  ATC-P 
Box  1500 

Huntsville  AL  35807 
205-895-4114 

Larry  Johnson 
LOGICON,  Inc. 

P.O. Box 2566
Framingham  MA  01701 
617-274-6333 


Robert M. Graham
University  of  Mass. 

Coins  Graduate  Research  Center 
Amherst  MA  01002 

Walter  Graves 
TRW 

One  Space  Park 
Bldg  90,  Rm  2876 
Redondo  Beach  CA  90278 
213-535-3531 

Paul Hilfinger
Dept of Comp Science
Carnegie-Mellon University
Pittsburgh PA 15213
412-687-5861 

R.H.  Hoffman 
TRW 

16011 El Camino Real
Houston TX 77062
713-333-3133 

Michael  Ikezawa 
LOGICON,  Inc. 

255  West  Fifth  St 
San  Pedro  CA  90733 
213-831-0611 

Capt  Jack  Ives 
RADC/ISIS 

Griffiss AFB NY 13441
315-330-7010 

Maj Ernest Jackson
Patriot  Proj.  Office 
Attn:  AMCPN-T-ES 

Redstone  Arsenal  AL  35809 
Autovon 742-3755


David  Kallran 
MITRE  Corp. 

Box  208 

Bedford  MA  01730 
617-271-2040 

Col  Robert  Krutz 
RADC/IS 

Griffiss AFB NY 13441
315-330-2204 


Frank  LaMonica 
RADC/ISIM 

Griffiss  AFB  NY  13441 
315-330-7834 

Michael Landes
RADC/ISM 

Griffiss  AFB  NY  13441 
315-330-2672 

Larry  Lombardo 
RADC/ISIM 

Griffiss AFB NY 13441
315-330-2672 

Prof.  D.C.  Luckham 
Stanford  University 
Palo  Alto  CA  94305 
415-497-4971 


Ann Marmor-Squires
National Security Agency
R142 

Ft  Meade  MD  20755 
301-688-7125 

James  Meehan 
LOGICON,  Inc. 

255  West  Fifth  St 
San  Pedro  CA  01701 
213-831-0611 




John  K.  Miller 
MITRE  Corp. 

P.O. Box 208 M.S. B070
Bedford  MA  01730 
217-271-2036 

Charles  McClure 
MITRE  Corp. 

P.O. Box 208
Bedford  MA  01730 
617-271-2175 

Dr.  Edward  Miller 
Science  Applications  Inc. 
244  Kearny  St 
San  Francisco  CA  94108 

Lt  Col  Mchemie 
AFOSR/NM 
Bldg  410 

Bolling  AFB  Wash  DC  20332 
202-693-0028 

John  McLean 
RADC/ISIS 

Griffiss  AFB  NY  13441 
315-330-7010 

Dr.  David  Musser 
USC  Information  Sciences 
4676  Admiralty  Way 
Marina Del Rey CA 90291
213-822-1511 

Lawrence  Nicola 
Science  Applications  Inc. 
2109  West  Clinton  Ave 
Huntsville  AL  35805 
205-533-5900 

Dean  Nordman 
GRC 
RD  H3 

Rome  NY  13440 
315-336-2545 

Leon Osterweil
Dept of Comp Sci
University  of  Colorado 
Boulder  CO  80309 
303-492-8787 


R.  Pennington 
General  Research  Corp. 
5383  Hollister  Ave 
Santa  Barbara  CA  93105 
805-964-7724 

Barry  Press 
TRW

One  Space  Park  Dr 
Redondo  Beach  CA  90278 
213-535-3535 

P.  Radatz 
LOGICON,  Inc. 

255  West  Fifth  St 
P.O. Box 471
San Pedro CA 90733
213-831-0611 

Don  Reifer 
Aerospace  Corp 
P.O. Box 92957
Los  Angeles  CA  90009 
213-648-5890 

David  Reilly 

Hughes  Aircraft  Company 

Box  3310 

Fullerton  CA  92634 
714-871-3232 

Donald  Roberts 
RADC/ISIS 

Griffiss AFB NY 13441
315-330-7546 

Dr.  Robinson 
Link  Hall 

Syracuse  University 
Syracuse  NY  13441 
315-423-3159 

Raymond  Rubey 
SOFTECH

4130  Linden  Ave 
Dayton OH 45432
513-252-2834 

Sabina  Saib 
General Research Corp.
Santa  Barbara  CA  93105 
805-964-7724 




Steve  Schanzer 
Analytic  Services 
5613  Leesburg  Pike 
Falls  Church  VA  22041 
703-620-2830 

Val  Schorre 

Systems  Development  Corp. 

2500  Colorado  Ave 
Santa  Monica  CA  90406 
213-829-1146 

Frank  Sliwa 
RADC/ISIM 

Griffiss  AFB  NY  13441 
315-330-7011 

Dr.  Jay  Spitzen 
S.R.I. 

333  Ravenswood  Ave 
Menlo  Park  CA  94025 
415-326-6200  X3044 

Robert  Stover 
RADC/ISIS 

Griffiss  AFB  NY  13441 
315-330-7010 

Leon Stucki

McDonnell-Douglas Astronautics
5301 Bolsa Ave
Huntington Beach CA 92647
714-896-3774

Capt  Alan  Sukert 
RADC/ISIS 

Griffiss  AFB  NY  13443 
315-330-4325 


Capt  W.  White 
ESD/MCIT 

Hanscom AFB MA 01731
617-861-5391 

Russell  Wilson 
SOFTECH 
81  Avalon  Road 
Newton  MA  02168 
617-890-6900

Richard  Weishert 
General  Research  Corp. 
5383 Hollister Ave
P.O. Box 3587
Santa  Barbara  CA  93105 
805-964-7728 

Dr.  Steven  Yau 
Dept  of  Comp  Sci 
Northwestern  University 
Evanston  IL  60201 
312-492-3641 

Donald  Young 
ESD/XRI 

Hanscom AFB MA
617-271-2756 

Selby  Young 
Kappa  Systems  Inc. 

1409  Potter  Dr. 

Colorado Springs CO 80909
303-597-1900 


Windsor  Thomas 
RADC/OSCP 

Griffiss  AFB  NY  13441 
315-330-4481 

Judy Townley
Aiken Comp. Lab
Harvard  University 
Cambridge  MA  02138 
617-495-3751 




